Abstract

In this paper, we consider the Gaussian multiple-input multiple-output (MIMO)
multi-receiver wiretap channel in which a transmitter wants to have confidential
communication with an arbitrary number of users in the presence of an external
eavesdropper. We derive the secrecy capacity region of this channel for the most general
case. We first show that even for the single-input single-output (SISO) case, existing
converse techniques for the Gaussian scalar broadcast channel cannot be extended to
this secrecy context, to emphasize the need for a new proof technique. Our new proof
technique makes use of the relationships between the minimum-mean-square-error and
the mutual information, and equivalently, the relationships between the Fisher
information and the differential entropy. Using the intuition gained from the converse proof
of the SISO channel, we first prove the secrecy capacity region of the degraded MIMO
channel, in which all receivers have the same number of antennas, and the noise
covariance matrices can be arranged according to a positive semi-definite order. We then
generalize this result to the aligned case, in which all receivers have the same
number of antennas, however there is no order among the noise covariance matrices. We
accomplish this task by using the channel enhancement technique. Finally, we find
the secrecy capacity region of the general MIMO channel by using some limiting
arguments on the secrecy capacity region of the aligned MIMO channel. We show that
the capacity achieving coding scheme is a variant of dirty-paper coding with Gaussian
signals.
1 Introduction



Information theoretic secrecy was initiated by Wyner in his seminal work [1]. Wyner consid- 
ered a degraded wiretap channel, where the eavesdropper gets a degraded version of the le- 
gitimate receiver's observation. For this degraded model, he found the capacity-equivocation 
rate region where the equivocation rate refers to the portion of the message rate that can 
be delivered to the legitimate receiver, while the eavesdropper is kept totally ignorant of 
this part. Later, Csiszár and Körner considered the general wiretap channel, where there is
no presumed degradation order between the legitimate user and the eavesdropper [2]. They
found the capacity-equivocation rate region of this general, not necessarily degraded, wiretap 
channel. 

In recent years, information theoretic secrecy has gathered a renewed interest, where most 
of the attention has been devoted to the multiuser extensions of the wiretap channel, see 
for example [3-21]. One natural extension of the wiretap channel to the multiuser setting 
is the problem of secure broadcasting. In this case, there is one transmitter which wants 
to communicate with several legitimate users in the presence of an external eavesdropper. 
Hereafter, we call this channel model the multi-receiver wiretap channel. Finding the secrecy 
capacity region for the most general form of this channel model seems to be quite challenging, 
especially if one remembers that, even without the eavesdropper, we do not know the
capacity region for the underlying channel, which is a general broadcast channel with an 
arbitrary number of users. However, we know the capacity region for some special classes of 
broadcast channels, which suggests that we might be able to find the secrecy capacity region 
for some special classes of multi-receiver wiretap channels. This suggestion has been taken 
into consideration in [8-11]. In particular, in [9-11], the degraded multi-receiver wiretap 
channel is considered, where there is a certain degradation order among the legitimate users 
and the eavesdropper. The corresponding secrecy capacity region is derived for the two-user 
case in [9], and for an arbitrary number of users in [10, 11]. The importance of this class lies 
in the fact that the Gaussian scalar multi-receiver wiretap channel belongs to this class. 

In this work, we start with the Gaussian scalar multi-receiver wiretap channel, and find
its secrecy capacity region. Although, in the later parts of the paper, we provide the secrecy 
capacity region of the Gaussian multiple-input multiple-output (MIMO) multi-receiver wire- 
tap channel which subsumes the scalar case, there are two reasons for the presentation of 
the scalar case separately. The first one is to show that existing converse techniques for
the Gaussian scalar broadcast channel, i.e., the converse proofs of Bergmans [22] and El 
Gamal [23], cannot be extended in a straightforward manner to provide a converse proof 
for the Gaussian scalar multi-receiver wiretap channel. We explicitly show that the main 
ingredient of these two converses in [22,23], which is the entropy-power inequality [24,25], 
is not sufficient to conclude a converse for the secrecy capacity region. The second reason 
for the separate presentation is to present the main ingredients of the technique that we 
will use to provide a converse proof for the general MIMO channel in an isolated manner 
in a simpler context. We provide two converse proofs for the Gaussian scalar multi-receiver
wiretap channel. The first one uses the connection between the minimum-mean-square-error 
(MMSE) and the mutual information along with the properties of the MMSE [26,27]. In 
additive Gaussian channels, the Fisher information, another important quantity in estima- 
tion theory, and the MMSE have a complementary relationship in the sense that one of 
them determines the other one, and vice versa [28]. Thus, the converse proof relying on 
the MMSE has a counterpart which replaces the MMSE with the Fisher information in the
corresponding converse proof. Hence, the second converse uses the connection between the
Fisher information and the differential entropy via the de Bruijn identity [24, 25] along with
the properties of the Fisher information. This reveals that either the Fisher information ma- 
trix or the MMSE matrix should play an important role in the converse proof of the MIMO 
case. 

Keeping this motivation in mind, we consider the Gaussian MIMO multi-receiver wiretap 
channel next. Instead of directly tackling the most general case in which each receiver 
has an arbitrary number of antennas and an arbitrary noise covariance matrix, we first 
consider two sub-classes of MIMO channels. In the first sub-class, all receivers have the 
same number of antennas and the noise covariance matrices exhibit a positive semi-definite 
order, which implies the degradedness of these channels. Hereafter, we call this channel 
model the degraded Gaussian MIMO multi-receiver wiretap channel. In the second sub-class, 
although all receivers still have the same number of antennas as in the degraded case, the 
noise covariance matrices do not have to satisfy any positive semi-definite order. Hereafter, 
we call this channel model the aligned Gaussian MIMO multi-receiver wiretap channel. Our 
approach will be to first find the secrecy capacity region of the degraded case, then to 
generalize this result to the aligned case by using the channel enhancement technique [29].
Once we obtain the secrecy capacity region of the aligned case, we use this result to find the 
secrecy capacity region of the most general case by some limiting arguments as in [29,30]. 
Thus, the main contribution and the novelty of our work is the way we prove the secrecy 
capacity region of the degraded Gaussian MIMO multi-receiver wiretap channel, since the 
remaining steps from then on are mainly adaptations of the existing proof techniques [29,30] 
to an eavesdropper and/or multiuser setting. 

At this point, to clarify our contributions, it might be useful to note the similarity of 
the proof steps that we follow with those in [29], where the capacity region of the Gaussian
MIMO broadcast channel was established. In [29] also, the authors considered the degraded,
the aligned and the general cases successively. Although both [29] and this paper have
these same proof steps, there are differences between how and why these steps are taken.
In [29], the main difficulty in obtaining the capacity region of the Gaussian MIMO broadcast
channel was to extend Bergmans' converse for the scalar case to the degraded vector channel. 
This difficulty was overcome in [29] by the invention of the channel enhancement technique. 
However, as discussed earlier, Bergmans' converse cannot be extended to our secrecy context,
even for the degraded scalar case. Thus, we need a new technique which we construct by
using the Fisher information matrix and the generalized de Bruijn identity [31]. After we
obtain the secrecy capacity region of the degraded MIMO channel, we adapt the channel 
enhancement technique to our setting to find the secrecy capacity region of the aligned 
MIMO channel. The difference of the way channel enhancement is used here as compared to 
the one in [29] comes from the presence of an eavesdropper, and its difference from the one 
in [30] is due to the presence of many legitimate users. After we find the secrecy capacity 
region of the aligned MIMO channel, we use the limiting arguments that appeared in [29,30] 
to prove the secrecy capacity region of the general MIMO channel. 

The single user version of the Gaussian MIMO multi-receiver wiretap channel we study 
here, i.e., the Gaussian MIMO wiretap channel, was solved by [32,33] for the general case 
and by [34] for the 2-2-1 case. Their common proof technique was to derive a Sato-type outer
bound on the secrecy capacity, and then to tighten this outer bound by searching over all 
possible correlation structures among the noise vectors of the legitimate user and the eaves- 
dropper. Later, [30] gave an alternative, simpler proof by using the channel enhancement 
technique. 

2 Multi-Receiver Wiretap Channels 

In this section, we first revisit the multi-receiver wiretap channel. The general multi-receiver 
wiretap channel consists of one transmitter with an input alphabet $\mathcal{X}$, $K$ legitimate receivers
with output alphabets $\mathcal{Y}_k$, $k = 1, \ldots, K$, and an eavesdropper with output alphabet $\mathcal{Z}$. The
transmitter sends a confidential message to each user, say $W_k \in \mathcal{W}_k$ to the $k$th user, and all
messages are to be kept secret from the eavesdropper. The channel is memoryless with a
transition probability $p(y_1, y_2, \ldots, y_K, z|x)$.

A $(2^{nR_1}, \ldots, 2^{nR_K}, n)$ code for this channel consists of $K$ message sets, $\mathcal{W}_k = \{1, \ldots, 2^{nR_k}\}$,
$k = 1, \ldots, K$, an encoder $f : \mathcal{W}_1 \times \ldots \times \mathcal{W}_K \to \mathcal{X}^n$, and $K$ decoders, one at each
legitimate receiver, $g_k : \mathcal{Y}_k^n \to \mathcal{W}_k$, $k = 1, \ldots, K$. The probability of error is defined as
$P_e^n = \max_{k=1,\ldots,K} \Pr\left[g_k(Y_k^n) \neq W_k\right]$. A rate tuple $(R_1, \ldots, R_K)$ is said to be achievable
if there exists a code with $\lim_{n \to \infty} P_e^n = 0$ and

$$ \lim_{n \to \infty} \frac{1}{n} H(S(W)|Z^n) \ge \sum_{k \in S(W)} R_k, \quad \forall S(W) \tag{1} $$

where $S(W)$ denotes any subset of $\{W_1, \ldots, W_K\}$. Hence, we consider only perfect secrecy
rates. The secrecy capacity region is defined as the closure of all achievable rate tuples.
The degraded multi-receiver wiretap channel exhibits the following Markov chain

$$ X \to Y_1 \to \ldots \to Y_K \to Z \tag{2} $$

whose capacity region was established in [10,11] for an arbitrary number of users and in [9]
for two users.

Theorem 1 The secrecy capacity region of the degraded multi-receiver wiretap channel is
given by the union of the rate tuples $(R_1, \ldots, R_K)$ satisfying

$$ R_k \le I(U_k; Y_k | U_{k+1}, Z), \quad k = 1, \ldots, K \tag{3} $$

where $U_1 = X$, $U_{K+1} = \phi$, and the union is over all probability distributions of the form

$$ p(u_K)\, p(u_{K-1}|u_K) \ldots p(u_2|u_3)\, p(x|u_2) \tag{4} $$

We remark here that since the channel is degraded, i.e., we have the Markov chain in (2),
the capacity expressions in (3) are equivalent to

$$ R_k \le I(U_k; Y_k | U_{k+1}) - I(U_k; Z | U_{k+1}), \quad k = 1, \ldots, K \tag{5} $$

We will use this equivalent expression frequently hereafter. For the case of two users and
one eavesdropper, i.e., $K = 2$, the expressions in (5) reduce to

$$ R_1 \le I(X; Y_1 | U_2) - I(X; Z | U_2) \tag{6} $$
$$ R_2 \le I(U_2; Y_2) - I(U_2; Z) \tag{7} $$

Finding the secrecy capacity region of the two-user degraded multi-receiver wiretap channel
is tantamount to finding the optimal joint distributions of $(X, U_2)$ that trace the boundary of
the secrecy capacity region given in (6)-(7). For the $K$-user degraded multi-receiver wiretap
channel, we need to find the optimal joint distributions of $(X, U_2, \ldots, U_K)$ in the form given
in (4) that trace the boundary of the region expressed in (3).
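
The equivalence between the rate expressions in (3) and (5) follows from a short chain-rule argument. The sketch below spells it out, assuming that the auxiliary random variables and the channel outputs form the Markov chain $(U_{k+1}, U_k) \to X \to Y_k \to Z$, as implied by the distribution in (4) and the degradedness in (2).

```latex
\begin{align*}
I(U_k; Y_k | U_{k+1}, Z)
  &= I(U_k; Y_k, Z | U_{k+1}) - I(U_k; Z | U_{k+1}) \\
  &= I(U_k; Y_k | U_{k+1}) + I(U_k; Z | U_{k+1}, Y_k) - I(U_k; Z | U_{k+1}) \\
  &= I(U_k; Y_k | U_{k+1}) - I(U_k; Z | U_{k+1})
\end{align*}
% The last step uses I(U_k; Z | U_{k+1}, Y_k) = 0: given Y_k, the eavesdropper's
% observation Z is independent of (U_k, U_{k+1}) by the assumed Markov chain.
```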



3 Gaussian MIMO Multi-receiver Wiretap Channel 

3.1 Degraded Gaussian MIMO Multi-receiver Wiretap Channel 

In this paper, we first consider the degraded Gaussian MIMO multi-receiver wiretap channel 
which is defined through 

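$$ \mathbf{Y}_k = \mathbf{X} + \mathbf{N}_k, \quad k = 1, \ldots, K \tag{8} $$
$$ \mathbf{Z} = \mathbf{X} + \mathbf{N}_Z \tag{9} $$

where the channel input $\mathbf{X}$ is subject to a covariance constraint

$$ E\left[\mathbf{X}\mathbf{X}^{\top}\right] \preceq \mathbf{S} \tag{10} $$

where $\mathbf{S} \succ 0$, and $\{\mathbf{N}_k\}_{k=1}^{K}, \mathbf{N}_Z$ are zero-mean Gaussian random vectors with covariance
matrices $\{\boldsymbol{\Sigma}_k\}_{k=1}^{K}, \boldsymbol{\Sigma}_Z$ which satisfy the following ordering

$$ 0 \prec \boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2 \preceq \ldots \preceq \boldsymbol{\Sigma}_K \preceq \boldsymbol{\Sigma}_Z \tag{11} $$

Footnote 1: Although in [10, 11] the secrecy capacity region of Theorem 1 is expressed in a different form, the
equivalence of the two expressions can be shown.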

In a multi-receiver wiretap channel, since the capacity-equivocation rate region depends
only on the conditional marginal distributions of the transmitter-receiver links, but not on
the entire joint distribution of the channel, the correlations among $\{\mathbf{N}_k\}_{k=1}^{K}, \mathbf{N}_Z$ have no
consequence on the capacity-equivocation rate region. Thus, without changing the
corresponding secrecy capacity region, we can adjust the correlation structure among these noise
vectors to ensure that they satisfy the following Markov chain

$$ \mathbf{X} \to \mathbf{Y}_1 \to \ldots \to \mathbf{Y}_K \to \mathbf{Z} \tag{12} $$

which is always possible because of our assumption regarding the covariance matrices in (11).
Moreover, the Markov chain in (12) implies that any Gaussian MIMO multi-receiver wiretap
channel satisfying the semi-definite ordering in (11) can be treated as a degraded
multi-receiver wiretap channel, hence Theorem 1 gives its capacity region. Hereafter, we will
assume that the degraded Gaussian MIMO wiretap channel satisfies the Markov chain in (12).
3.2 Aligned Gaussian MIMO Multi-receiver Wiretap Channel 

Next, we consider the aligned Gaussian MIMO multi-receiver wiretap channel which is again 
defined by (8)-(9), and the input is again subject to a covariance constraint as in (10) with
$\mathbf{S} \succ 0$. However, for the aligned Gaussian MIMO multi-receiver wiretap channel, noise
covariance matrices do not have any semi-definite ordering, as opposed to the degraded
case which exhibits the ordering in (11). For the aligned Gaussian MIMO multi-receiver
wiretap channel, the only assumption on the noise covariance matrices is that they are
strictly positive-definite, i.e., $\boldsymbol{\Sigma}_i \succ 0$, $i = 1, \ldots, K$ and $\boldsymbol{\Sigma}_Z \succ 0$. Since this channel does not
have any ordering among the noise covariance matrices, it cannot be considered as a degraded 
channel, thus there is no single-letter formula for its secrecy capacity region. Moreover, we do 
not expect superposition coding with stochastic encoding to be optimal, as it was optimal for 
the degraded channel. Indeed, we will show that dirty-paper coding with stochastic encoding 
is optimal in this case. 
3.3 General Gaussian MIMO Multi-receiver Wiretap Channel

Finally, we consider the most general form of the Gaussian MIMO multi-receiver wiretap 
channel which is given by 

$$ \mathbf{Y}_k = \mathbf{H}_k \mathbf{X} + \mathbf{N}_k, \quad k = 1, \ldots, K \tag{13} $$
$$ \mathbf{Z} = \mathbf{H}_Z \mathbf{X} + \mathbf{N}_Z \tag{14} $$

where the channel input $\mathbf{X}$, which is a $t \times 1$ column vector, is again subject to a covariance
constraint as in (10) with $\mathbf{S} \succeq 0$. The channel output for the $k$th user is denoted by $\mathbf{Y}_k$
which is a column vector of size $r_k \times 1$, $k = 1, \ldots, K$. The eavesdropper's observation $\mathbf{Z}$ is
of size $r_Z \times 1$. The covariance matrices of the Gaussian random vectors $\{\mathbf{N}_k\}_{k=1}^{K}, \mathbf{N}_Z$ are
denoted by $\{\boldsymbol{\Sigma}_k\}_{k=1}^{K}, \boldsymbol{\Sigma}_Z$, which are assumed to be strictly positive definite. The channel
gain matrices $\{\mathbf{H}_k\}_{k=1}^{K}, \mathbf{H}_Z$ are of sizes $\{r_k \times t\}_{k=1}^{K}, r_Z \times t$, respectively, and they are known
to the transmitter, all legitimate users and the eavesdropper.



3.4 A Comment on the Covariance Constraint 

In the literature, it is more common to define capacity regions under a total power
constraint, i.e., $\mathrm{tr}\left(E\left[\mathbf{X}\mathbf{X}^{\top}\right]\right) \le P$, instead of the covariance constraint that we imposed, i.e.,
$E\left[\mathbf{X}\mathbf{X}^{\top}\right] \preceq \mathbf{S}$. However, as shown in [29], once the capacity region is obtained under a
covariance constraint, then the capacity region under more lenient constraints on the channel
inputs can be obtained, if these constraints can be expressed as compact sets defined over
the input covariance matrices. For example, the total power constraint and the per-antenna
power constraint can be described by compact sets of input covariance matrices as follows

$$ \mathcal{S}^{\text{total}} = \{\mathbf{S} \succeq 0 : \mathrm{tr}(\mathbf{S}) \le P\} \tag{15} $$
$$ \mathcal{S}^{\text{per-ant}} = \{\mathbf{S} \succeq 0 : S_{ii} \le P_i, \; i = 1, \ldots, t\} \tag{16} $$

respectively, where $S_{ii}$ is the $i$th diagonal entry of $\mathbf{S}$, and $t$ denotes the number of transmit
antennas. Thus, if the secrecy capacity region under a covariance constraint $E\left[\mathbf{X}\mathbf{X}^{\top}\right] \preceq \mathbf{S}$
is found and denoted by $\mathcal{C}(\mathbf{S})$, then the secrecy capacity regions under the total power
constraint and the per-antenna power constraint can be expressed as

$$ \mathcal{C}^{\text{total}} = \bigcup_{\mathbf{S} \in \mathcal{S}^{\text{total}}} \mathcal{C}(\mathbf{S}) \tag{17} $$
$$ \mathcal{C}^{\text{per-ant}} = \bigcup_{\mathbf{S} \in \mathcal{S}^{\text{per-ant}}} \mathcal{C}(\mathbf{S}) \tag{18} $$

respectively.
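
As an illustration of the two constraint sets above, the following minimal sketch tests whether a candidate input covariance matrix belongs to the total-power set or the per-antenna set. It assumes NumPy; the function names `in_total_power_set` and `in_per_antenna_set` and the numerical tolerance are hypothetical choices made only for this example.

```python
import numpy as np

def _is_psd(S, tol):
    """Symmetric with no eigenvalue below -tol."""
    return bool(np.allclose(S, S.T) and np.linalg.eigvalsh(S).min() >= -tol)

def in_total_power_set(S, P, tol=1e-10):
    """Membership in the total-power set: S is PSD and tr(S) <= P."""
    S = np.asarray(S, dtype=float)
    return _is_psd(S, tol) and bool(np.trace(S) <= P + tol)

def in_per_antenna_set(S, P_per_antenna, tol=1e-10):
    """Membership in the per-antenna set: S is PSD and S_ii <= P_i for every i."""
    S = np.asarray(S, dtype=float)
    limits = np.asarray(P_per_antenna, dtype=float)
    return _is_psd(S, tol) and bool(np.all(np.diag(S) <= limits + tol))
```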

One other comment about the covariance constraint on the channel input is regarding 
the positive definiteness of S. Following Lemma 2 of [29], it can be shown that, for any 
degraded (resp. aligned) Gaussian MIMO multi-receiver channel under a covariance
constraint $E\left[\mathbf{X}\mathbf{X}^{\top}\right] \preceq \mathbf{S}$ where $\mathbf{S}$ is a non-invertible positive semi-definite matrix, i.e., $\mathbf{S} \succeq 0$
and $|\mathbf{S}| = 0$, we can find another equivalent degraded (resp. aligned) channel with fewer
transmit and receive antennas under a covariance constraint $E\left[\mathbf{X}\mathbf{X}^{\top}\right] \preceq \mathbf{S}'$ such that $\mathbf{S}' \succ 0$.
Here the equivalence refers to the fact that both of these channels will have the same secrecy 
capacity region. Thus, as long as a degraded or an aligned channel is considered, there is no 
loss of generality in imposing a covariance constraint with a strictly positive definite matrix 
S, and this is why we assumed that S is strictly positive definite for the degraded and the 
aligned channels. 



4 Gaussian SISO Multi-receiver Wiretap Channel 

We first visit the Gaussian SISO multi-receiver wiretap channel. The aims of this section are 
to show that a straightforward extension of existing converse techniques for the Gaussian 
scalar broadcast channel fails to provide a converse proof for the Gaussian SISO multi- 
receiver wiretap channel, and to provide an alternative proof technique using either the 
MMSE or the Fisher information along with their connections with the differential entropy. 
To this end, we first define the Gaussian SISO multi-receiver wiretap channel 

$$ Y_k = X + N_k, \quad k = 1, 2 \tag{19} $$
$$ Z = X + N_Z \tag{20} $$

where we also restrict our attention to the two-user case for simplicity of the presentation.
The channel input $X$ is subject to a power constraint $E\left[X^2\right] \le P$. The variances of the
zero-mean Gaussian random variables $N_1, N_2, N_Z$ are given by $\sigma_1^2, \sigma_2^2, \sigma_Z^2$, respectively, and
satisfy the following order

$$ \sigma_1^2 \le \sigma_2^2 \le \sigma_Z^2 \tag{21} $$

Since the correlations among $N_1, N_2, N_Z$ have no effect on the secrecy capacity region, we
can adjust the correlation structure to ensure the following Markov chain

$$ X \to Y_1 \to Y_2 \to Z \tag{22} $$

Thus, this channel can be considered as a degraded channel, and its secrecy capacity region
is given by Theorem 1, in particular, by (6) and (7). Hence, to compute the secrecy capacity
region explicitly, we need to find the optimal joint distributions of $(X, U_2)$ in (6) and (7).
The corresponding secrecy capacity region is given by the following theorem. 

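Theorem 2 The secrecy capacity region of the two-user Gaussian SISO wiretap channel is
given by the union of the rate pairs $(R_1, R_2)$ satisfying

$$ R_1 \le \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_1^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{23} $$
$$ R_2 \le \frac{1}{2} \log\left(1 + \frac{\bar{\alpha} P}{\alpha P + \sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\bar{\alpha} P}{\alpha P + \sigma_Z^2}\right) \tag{24} $$

where the union is over all $\alpha \in [0, 1]$, and $\bar{\alpha}$ denotes $1 - \alpha$.

The achievability of this region can be shown by selecting $(X, U_2)$ to be jointly Gaussian
in Theorem 1. We focus on the converse proof.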
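
To make the region of Theorem 2 concrete, the following minimal sketch evaluates the boundary rate pair for a given power split. It assumes natural logarithms (rates in nats), since the base of the logarithm is not fixed in the text; the function name `siso_secrecy_rates` and its argument names are hypothetical.

```python
import math

def siso_secrecy_rates(alpha, P, var1, var2, var_z):
    """Boundary rate pair (R1, R2) of the two-user Gaussian SISO region, in nats.

    alpha is the fraction of the power P assigned to the first (stronger) user;
    var1 <= var2 <= var_z are the noise variances of user 1, user 2 and the
    eavesdropper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not var1 <= var2 <= var_z:
        raise ValueError("noise variances must be ordered: var1 <= var2 <= var_z")
    alpha_bar = 1.0 - alpha
    r1 = (0.5 * math.log(1.0 + alpha * P / var1)
          - 0.5 * math.log(1.0 + alpha * P / var_z))
    r2 = (0.5 * math.log(1.0 + alpha_bar * P / (alpha * P + var2))
          - 0.5 * math.log(1.0 + alpha_bar * P / (alpha * P + var_z)))
    return r1, r2
```

Sweeping `alpha` over a grid in [0, 1] traces the boundary; at `alpha = 1` all power serves the first user and the second user's rate is zero, and at `alpha = 0` the roles are reversed.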

4.1 Insufficiency of the Entropy-Power Inequality 

As a natural approach, one might try to adopt the converse proofs of the scalar Gaussian 
broadcast channel for the converse proof of Theorem 2. In the literature, there are two
converses for the Gaussian scalar broadcast channel which share some main principles. The 
first converse was given by Bergmans [22] who used Fano's lemma in conjunction with the 
entropy-power inequality [24,25] to find the capacity region. Later, El Gamal gave a relatively 
simple proof [23] which does not resort to Fano's lemma. Rather, he started from the
single-letter expression for the capacity region and used entropy-power inequality [24,25] to 
evaluate this region. Thus, the entropy-power inequality [24, 25] is the main ingredient of 
these converses. 

We now attempt to extend these converses to our secrecy context, i.e., to provide the 
converse proof of Theorem 2, and show where the argument breaks. In particular, what
we will show in the following discussion is that a stand-alone use of the entropy-power in- 
equality [24, 25] falls short of proving the optimality of Gaussian signalling in this secrecy 
context, as opposed to the Gaussian scalar broadcast channel. For that purpose, we con- 
sider El Gamal's converse for the Gaussian scalar broadcast channel. However, since the 
entropy-power inequality is in a central role for both El Gamal's and Bergmans' converse, 
the upcoming discussion can be carried out by using Bergmans' proof as well. 

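First, we consider the bound on the second user's secrecy rate. Using (7), we have

$$ I(U_2; Y_2) - I(U_2; Z) = \left[I(X; Y_2) - I(X; Z)\right] - \left[I(X; Y_2|U_2) - I(X; Z|U_2)\right] \tag{25} $$

where the right-hand side is obtained by using the chain rule, and the Markov chain
$U_2 \to X \to (Y_1, Y_2, Z)$. The expression in the first bracket is maximized by Gaussian $X$ [35]
yielding

$$ I(X; Y_2) - I(X; Z) \le \frac{1}{2} \log\left(1 + \frac{P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{P}{\sigma_Z^2}\right) \tag{26} $$

Moreover, using the Markov chain $U_2 \to X \to Y_2 \to Z$, we can bound the expression in the
second bracket as

$$ 0 \le I(X; Y_2|U_2) - I(X; Z|U_2) \tag{27} $$
$$ \le I(X; Y_2) - I(X; Z) \tag{28} $$
$$ \le \frac{1}{2} \log\left(1 + \frac{P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{P}{\sigma_Z^2}\right) \tag{29} $$

which implies that for any $(X, U_2)$ pair, there exists an $\alpha \in [0, 1]$ such that

$$ I(X; Y_2|U_2) - I(X; Z|U_2) = \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{30} $$

Combining (26) and (30) in (25) yields the desired bound on $R_2$ given in (24).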



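From now on, we focus on obtaining the bound given in (23) on the first user's secrecy
rate. To this end, one needs to solve the following optimization problem

$$ \max \; I(X; Y_1|U_2) - I(X; Z|U_2) \tag{31} $$
$$ \text{s.t.} \; I(X; Y_2|U_2) - I(X; Z|U_2) = \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{32} $$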



When the term $I(X; Z|U_2)$ is absent in both the objective function and the constraint, as in
the case of the Gaussian scalar broadcast channel, the entropy-power inequality [24,25] can
be used to solve this optimization problem. However, the presence of this term complicates
the situation, and a stand-alone use of the entropy-power inequality [24, 25] does not seem
to be sufficient. To substantiate this claim, let us consider the objective function in (31)

$$ I(X; Y_1|U_2) - I(X; Z|U_2) = h(Y_1|U_2) - h(Z|U_2) - \frac{1}{2} \log \frac{\sigma_1^2}{\sigma_Z^2} \tag{33} $$
$$ \le \frac{1}{2} \log\left(e^{2h(Z|U_2)} - 2\pi e\left(\sigma_Z^2 - \sigma_1^2\right)\right) - h(Z|U_2) - \frac{1}{2} \log \frac{\sigma_1^2}{\sigma_Z^2} \tag{34} $$

where the inequality is obtained by using the entropy-power inequality. Since the right-hand
side of (34) is monotonically increasing in $h(Z|U_2)$, to show the optimality of Gaussian
signalling, we need

$$ h(Z|U_2) \le \frac{1}{2} \log 2\pi e\left(\alpha P + \sigma_Z^2\right) \tag{35} $$



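Footnote 2: Equivalently, one can consider the following optimization problem
$$ \max \; I(X; Y_1|U_2) - I(X; Y_2|U_2) $$
$$ \text{s.t.} \; I(X; Y_2|U_2) - I(X; Z|U_2) = \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) $$
which, in turn, would yield a similar contradiction.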
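which will result in the desired bound on (31), i.e., the following

$$ I(X; Y_1|U_2) - I(X; Z|U_2) \le \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_1^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{36} $$

which is the desired end-result in (23).

We now check whether (35) holds under the constraint given in (32). To this end, consider
the difference of mutual informations in (32)

$$ I(X; Y_2|U_2) - I(X; Z|U_2) = h(Y_2|U_2) - h(Z|U_2) - \frac{1}{2} \log \frac{\sigma_2^2}{\sigma_Z^2} \tag{37} $$
$$ \le \frac{1}{2} \log\left(e^{2h(Z|U_2)} - 2\pi e\left(\sigma_Z^2 - \sigma_2^2\right)\right) - h(Z|U_2) - \frac{1}{2} \log \frac{\sigma_2^2}{\sigma_Z^2} \tag{38} $$

where the inequality is obtained by using the entropy-power inequality. Now, using the
constraint given in (32) in (38), we get

$$ \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \le \frac{1}{2} \log\left(e^{2h(Z|U_2)} - 2\pi e\left(\sigma_Z^2 - \sigma_2^2\right)\right) - h(Z|U_2) - \frac{1}{2} \log \frac{\sigma_2^2}{\sigma_Z^2} \tag{39} $$

which implies

$$ \frac{1}{2} \log 2\pi e\left(\alpha P + \sigma_Z^2\right) \le h(Z|U_2) \tag{40} $$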
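
For completeness, the algebra behind the last implication can be sketched as follows. Substituting the constraint (32) into the entropy-power bound (38), writing $\frac{1}{2}\log(1 + \alpha P/\sigma^2) = \frac{1}{2}\log\left((\alpha P + \sigma^2)/\sigma^2\right)$, cancelling the common term $-\frac{1}{2}\log(\sigma_2^2/\sigma_Z^2)$ and absorbing $h(Z|U_2)$ into the logarithm gives the first line below. The sketch assumes the strict ordering $\sigma_2^2 < \sigma_Z^2$ so that the common factor can be divided out.

```latex
\begin{align*}
\frac{1}{2}\log\frac{\alpha P + \sigma_2^2}{\alpha P + \sigma_Z^2}
  &\le \frac{1}{2}\log\left(1 - 2\pi e\left(\sigma_Z^2 - \sigma_2^2\right) e^{-2h(Z|U_2)}\right) \\
\Longleftrightarrow\quad
2\pi e\left(\sigma_Z^2 - \sigma_2^2\right) e^{-2h(Z|U_2)}
  &\le 1 - \frac{\alpha P + \sigma_2^2}{\alpha P + \sigma_Z^2}
   = \frac{\sigma_Z^2 - \sigma_2^2}{\alpha P + \sigma_Z^2} \\
\Longleftrightarrow\quad
e^{2h(Z|U_2)} &\ge 2\pi e\left(\alpha P + \sigma_Z^2\right)
\end{align*}
```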

Thus, as opposed to the inequality that we need to show the optimality of Gaussian signalling
via the entropy-power inequality, i.e., the bound in (35), we have an opposite inequality. This
discussion reveals that if Gaussian signalling is optimal, then its proof cannot be deduced
from a straightforward extension of the converse proofs for the Gaussian scalar broadcast
channel in [22,23]. Thus, we need a new technique to provide the converse for Theorem 2.
We now present two different proofs. The first proof relies on the relationship between the 
MMSE and the mutual information along with the properties of the MMSE, and the second 
proof replaces the MMSE with the Fisher information. 

4.2 Converse for Theorem 2 Using the MMSE

We now provide a converse which uses the connection between the MMSE and the mutual 
information established in [26,27]. In [27], the authors also give an alternative converse for 
the scalar Gaussian broadcast channel. Our proof will follow this converse, and generalize it 
to the context where there are secrecy constraints. 

First, we briefly state the necessary background information. Let $N$ be a zero-mean
unit-variance Gaussian random variable, and $(U, X)$ be a pair of arbitrarily correlated random
variables which are independent of $N$. The MMSE of $X$ when it is observed through $U$ and
$\sqrt{t}X + N$ is

$$ \mathrm{mmse}(X, t|U) = E\left[\left(X - E\left[X \,|\, \sqrt{t}X + N, U\right]\right)^2\right] \tag{41} $$

As shown in [26,27], the MMSE and the conditional mutual information are related through

$$ I(X; \sqrt{t}X + N \,|\, U) = \frac{1}{2} \int_0^t \mathrm{mmse}(X, \tau|U) \, d\tau \tag{42} $$
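
The relation between the MMSE and the mutual information can be checked numerically in the simplest setting. The following minimal sketch assumes an unconditioned Gaussian input (so that $U$ plays no role) with variance `var`, for which the MMSE and the mutual information are both known in closed form; the function names are hypothetical and the integral is approximated with a plain trapezoidal rule.

```python
import numpy as np

def gaussian_mmse(t, var):
    """MMSE of X ~ N(0, var) observed through sqrt(t) * X + N with N ~ N(0, 1)."""
    return var / (1.0 + t * var)

def mutual_info_from_mmse(snr, var, num=20001):
    """Half the integral of the MMSE over [0, snr], via the trapezoidal rule."""
    grid = np.linspace(0.0, snr, num)
    vals = gaussian_mmse(grid, var)
    integral = np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2.0
    return 0.5 * integral

def gaussian_mutual_info(snr, var):
    """Closed form I(X; sqrt(snr) X + N) = 0.5 * log(1 + snr * var), in nats."""
    return 0.5 * np.log1p(snr * var)
```

For a Gaussian input the two quantities should agree up to the quadrature error of the integration grid.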

For our converse, we need the following proposition which was proved in [27].

Proposition 1 ([27], Proposition 12) Let $U, X, N$ be as specified above. The function

$$ f(t) = \frac{\sigma^2}{\sigma^2 t + 1} - \mathrm{mmse}(X, t|U) \tag{43} $$

has at most one zero in $[0, \infty)$ unless $X$ is Gaussian conditioned on $U$ with variance $\sigma^2$, in
which case the function is identically zero on $[0, \infty)$. In particular, if $t_0 < \infty$ is the unique
zero, then $f(t)$ is strictly increasing on $[0, t_0]$, and strictly positive on $(t_0, \infty)$.
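
The single-crossing behaviour stated in Proposition 1 can be illustrated with a non-Gaussian input. The following minimal sketch assumes an unconditioned, equiprobable binary input $X \in \{-1, +1\}$, for which $\mathrm{mmse}(X, t) = 1 - E[\tanh(t + \sqrt{t}N)]$ with $N$ standard Gaussian; this input, the choice $\sigma^2 = 0.5$ in the usage below, and all function names are assumptions made for illustration only. The expectation is evaluated by Gauss-Hermite quadrature.

```python
import numpy as np

# Gauss-Hermite nodes and weights for integrals against exp(-x^2)
_NODES, _WEIGHTS = np.polynomial.hermite.hermgauss(80)

def bpsk_mmse(t):
    """MMSE of equiprobable X in {-1, +1} observed through sqrt(t) X + N."""
    z = np.sqrt(2.0) * _NODES  # change of variables to a standard normal
    expectation = np.sum(_WEIGHTS * np.tanh(t + np.sqrt(t) * z)) / np.sqrt(np.pi)
    return 1.0 - expectation

def f(t, var):
    """Difference between the Gaussian MMSE curve with variance var and the binary MMSE."""
    return var / (var * t + 1.0) - bpsk_mmse(t)

def count_sign_changes(var, t_max=20.0, num=2001):
    """Count sign changes of f on a uniform grid over [0, t_max]."""
    values = np.array([f(t, var) for t in np.linspace(0.0, t_max, num)])
    signs = np.sign(values[np.abs(values) > 1e-9])
    return int(np.sum(signs[1:] != signs[:-1]))
```

With `var = 0.5` the function starts negative at `t = 0` (the binary input has unit variance) and ends positive for large `t`, so a single sign change on the grid is consistent with the proposition.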

We now give the converse. We use exactly the same steps from (25) to (30) to establish
the bound on the secrecy rate of the second user given in (24). To bound the secrecy rate of
the first user, we first restate (30) as

$$ I(X; Y_2|U_2) - I(X; Z|U_2) = I(X; (1/\sigma_2)X + N \,|\, U_2) - I(X; (1/\sigma_Z)X + N \,|\, U_2) \tag{44} $$
$$ = \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{45} $$
$$ = \frac{1}{2} \int_{1/\sigma_Z^2}^{1/\sigma_2^2} \frac{\alpha P}{t \alpha P + 1} \, dt \tag{46} $$

Furthermore, due to (42), we also have

$$ I(X; Y_2|U_2) - I(X; Z|U_2) = I(X; (1/\sigma_2)X + N \,|\, U_2) - I(X; (1/\sigma_Z)X + N \,|\, U_2) $$
$$ = \frac{1}{2} \int_{1/\sigma_Z^2}^{1/\sigma_2^2} \mathrm{mmse}(X, t|U_2) \, dt \tag{47} $$

Comparing (46) and (47) reveals that either we have

$$ \mathrm{mmse}(X, t|U_2) = \frac{\alpha P}{t \alpha P + 1} \tag{48} $$

for all $t \in [1/\sigma_Z^2, 1/\sigma_2^2]$, or there exists a unique $t_0 \in (1/\sigma_Z^2, 1/\sigma_2^2)$ such that

$$ \mathrm{mmse}(X, t_0|U_2) = \frac{\alpha P}{t_0 \alpha P + 1} \tag{49} $$

and

$$ \mathrm{mmse}(X, t|U_2) \le \frac{\alpha P}{t \alpha P + 1} \tag{50} $$

for $t \ge t_0$, because of Proposition 1. The former case occurs if $X$ is Gaussian conditioned
on $U_2$ with variance $\alpha P$, in which case we arrive at the desired bound on the secrecy rate of
the first user given in (23). If we assume that the latter case in (49)-(50) occurs, then, we
can use the following sequence of derivations to bound the first user's secrecy rate



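$$ I(X; Y_1|U_2) - I(X; Z|U_2) = I(X; (1/\sigma_1)X + N \,|\, U_2) - I(X; (1/\sigma_Z)X + N \,|\, U_2) \tag{51} $$
$$ = \frac{1}{2} \int_{1/\sigma_Z^2}^{1/\sigma_1^2} \mathrm{mmse}(X, t|U_2) \, dt \tag{52} $$
$$ = \frac{1}{2} \int_{1/\sigma_Z^2}^{1/\sigma_2^2} \mathrm{mmse}(X, t|U_2) \, dt + \frac{1}{2} \int_{1/\sigma_2^2}^{1/\sigma_1^2} \mathrm{mmse}(X, t|U_2) \, dt \tag{53} $$
$$ = \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) + \frac{1}{2} \int_{1/\sigma_2^2}^{1/\sigma_1^2} \mathrm{mmse}(X, t|U_2) \, dt \tag{54} $$
$$ \le \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) + \frac{1}{2} \int_{1/\sigma_2^2}^{1/\sigma_1^2} \frac{\alpha P}{t \alpha P + 1} \, dt \tag{55} $$
$$ = \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_1^2}\right) - \frac{1}{2} \log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{56} $$

where (54) follows from (46) and (47), and (55) is due to (50). Since (56) is the desired
bound on the secrecy rate of the first user given in (23), this completes the converse proof.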



4.3 Converse for Theorem 2 Using the Fisher Information

We now provide an alternative converse which replaces the MMSE with the Fisher informa- 
tion in the above proof. We first provide some basic definitions. The unconditional versions 
of the following definition and the upcoming results regarding the Fisher information can 
be found in standard detection-estimation texts; to note one, [36] is a good reference for a 
detailed treatment of the subject. 

Definition 1 Let X, U be arbitrarily correlated random variables with well-defined densities, 
and f{x\u) be the corresponding conditional density. The conditional Fisher information of 
$X$ is defined by

$$ J(X|U) = E\left[\left(\frac{\partial \log f(x|u)}{\partial x}\right)^2\right] \tag{57} $$

where the expectation is over $(U, X)$.



The vector generalization of the following conditional form of the Fisher information
inequality will be given in Lemma 15 in Section 5, thus its proof is omitted here.

Lemma 1 Let $U, X, Y$ be random variables, and let the density for any combination of them
exist. Moreover, let us assume that given $U$, $X$ and $Y$ are independent. Then, we have

$$ J(X + Y|U) \le \beta^2 J(X|U) + (1 - \beta)^2 J(Y|U) \tag{58} $$

for any $\beta \in [0, 1]$.

Corollary 1 Let $X, Y, U$ be as specified above. Then, we have

$$ \frac{1}{J(X + Y|U)} \ge \frac{1}{J(X|U)} + \frac{1}{J(Y|U)} \tag{59} $$

Proof: Select

$$ \beta = \frac{J(Y|U)}{J(X|U) + J(Y|U)} \tag{60} $$

in the previous lemma. ■
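
The step from Lemma 1 to Corollary 1 can be verified by direct substitution. The sketch below assumes the choice $\beta = J(Y|U) / (J(X|U) + J(Y|U))$, which is the value minimizing the right-hand side of the Fisher information inequality over $\beta \in [0, 1]$.

```latex
\begin{align*}
\beta^2 J(X|U) + (1 - \beta)^2 J(Y|U)
  &= \frac{J(Y|U)^2 \, J(X|U) + J(X|U)^2 \, J(Y|U)}{\left(J(X|U) + J(Y|U)\right)^2}
   = \frac{J(X|U) \, J(Y|U)}{J(X|U) + J(Y|U)} \\
\Longrightarrow\quad
\frac{1}{J(X + Y|U)}
  &\ge \frac{J(X|U) + J(Y|U)}{J(X|U) \, J(Y|U)}
   = \frac{1}{J(X|U)} + \frac{1}{J(Y|U)}
\end{align*}
```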

Similarly, the vector generalization of the following conditional form of the Cramér-Rao
inequality will be given in Lemma 13 in Section 5, and hence, its proof is omitted here.

Lemma 2 Let $X, U$ be arbitrarily correlated random variables with well-defined densities.
Then, we have

$$ J(X|U) \ge \frac{1}{\mathrm{Var}(X|U)} \tag{61} $$

with equality if $(U, X)$ is jointly Gaussian.

We now provide the conditional form of the de Bruijn identity [24,25]. The vector
generalization of this lemma will be provided in Section 5.4, and hence, its proof is
omitted here.

Lemma 3 Let $X, U$ be arbitrarily correlated random variables with finite second order
moments. Moreover, assume that they are independent of $N$ which is a zero-mean unit-variance
Gaussian random variable. Then, we have

$$ \frac{d}{dt} h(X + \sqrt{t}N \,|\, U) = \frac{1}{2} J(X + \sqrt{t}N \,|\, U) \tag{62} $$
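
As a sanity check on the de Bruijn identity (not part of the original argument), consider the special case in which $X$ is Gaussian with variance $v$ and independent of $U$, so that the conditioning can be dropped. Both sides of the identity are then available in closed form, using the fact that a Gaussian random variable with variance $v + t$ has Fisher information $1/(v + t)$, i.e., it meets the Cramér-Rao inequality with equality.

```latex
\begin{align*}
X + \sqrt{t}N &\sim \mathcal{N}(0, v + t), \qquad
h(X + \sqrt{t}N) = \frac{1}{2} \log 2\pi e \, (v + t) \\
\frac{d}{dt} h(X + \sqrt{t}N) &= \frac{1}{2} \cdot \frac{1}{v + t}
  = \frac{1}{2} J(X + \sqrt{t}N)
\end{align*}
```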

We now note the following complementary relationship between the MMSE and the
Fisher information [26, 28]

$$ J(\sqrt{t}X + N) = 1 - t \cdot \mathrm{mmse}(X, t) \tag{63} $$

which itself suggests the existence of an alternative converse which uses the Fisher
information instead of the MMSE. We now provide the alternative converse based on the Fisher
information. We first bound the secrecy rate of the second user as in the previous section,
by following the exact steps from (25) to (30). To bound the secrecy rate of the first user,
we first rewrite (30) as follows

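$$ I(X; Y_2|U_2) - I(X; Z|U_2) = h(X + \sigma_2 N|U_2) - h(X + \sigma_Z N|U_2) - \frac{1}{2} \log \frac{\sigma_2^2}{\sigma_Z^2} \tag{64} $$
$$ = -\frac{1}{2} \int_{\sigma_2^2}^{\sigma_Z^2} J(X + \sqrt{t}N|U_2) \, dt - \frac{1}{2} \log \frac{\sigma_2^2}{\sigma_Z^2} \tag{65} $$
$$ = -\frac{1}{2} \int_{\sigma_2^2}^{\sigma_Z^2} J(X + \sqrt{t - t^*}N' + \sqrt{t^*}N''|U_2) \, dt - \frac{1}{2} \log \frac{\sigma_2^2}{\sigma_Z^2} \tag{66} $$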



where (65) follows from Lemma 3, and in (66), we used the stability of Gaussian random
variables, where $N'$, $N''$ are two independent zero-mean unit-variance Gaussian random variables.
Moreover, $t^*$ is selected in the range $(0, \sigma_2^2)$. We now use Corollary 1 to bound the
conditional Fisher information in (66) as follows

\begin{align}
\frac{1}{J(X + \sqrt{t-t^*}\,N' + \sqrt{t^*}\,N''|U_2)} &\geq \frac{1}{J(X + \sqrt{t^*}\,N''|U_2)} + \frac{1}{J(\sqrt{t-t^*}\,N'|U_2)} \tag{67}\\
&= \frac{1}{J(X + \sqrt{t^*}\,N''|U_2)} + (t - t^*) \tag{68}
\end{align}

where the equality follows from Lemma 2. The inequality in (68) is equivalent to

\[ J(X + \sqrt{t-t^*}\,N' + \sqrt{t^*}\,N''|U_2) \leq \frac{J(X + \sqrt{t^*}\,N''|U_2)}{1 + J(X + \sqrt{t^*}\,N''|U_2)\,(t-t^*)} \tag{69} \]
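using which in (66) yields

\begin{align}
I(X;Y_2|U_2) - I(X;Z|U_2) &\geq -\frac{1}{2}\int_{\sigma_2^2}^{\sigma_Z^2} \frac{J(X + \sqrt{t^*}\,N''|U_2)}{1 + J(X + \sqrt{t^*}\,N''|U_2)\,(t-t^*)}\,dt - \frac{1}{2}\log\frac{\sigma_2^2}{\sigma_Z^2} \tag{70}\\
&= -\frac{1}{2}\log\frac{1 + J(X + \sqrt{t^*}\,N''|U_2)\,(\sigma_Z^2 - t^*)}{1 + J(X + \sqrt{t^*}\,N''|U_2)\,(\sigma_2^2 - t^*)} - \frac{1}{2}\log\frac{\sigma_2^2}{\sigma_Z^2} \tag{71}
\end{align}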
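We recall that we had already fixed the left-hand side of this inequality as

\[ I(X;Y_2|U_2) - I(X;Z|U_2) = \frac{1}{2}\log\left(1 + \frac{\alpha P}{\sigma_2^2}\right) - \frac{1}{2}\log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{72} \]

in (30). Comparison of (71) and (72) results in

\[ J(X + \sqrt{t^*}\,N''|U_2) \geq \frac{1}{\alpha P + t^*}, \qquad 0 < t^* < \sigma_2^2 \tag{73} \]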



At this point, we compare the inequalities in (50) and (73). These two inequalities imply
each other through the complementary relationship between the MMSE and the Fisher
information given in (63), after an appropriate change of variables and by noting that
$J(aX) = (1/a^2)J(X)$ [36]. We now find the desired bound on the secrecy rate of the first user
by using the inequality in (73)



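\begin{align}
I(X;Y_1|U_2) - I(X;Z|U_2) &= h(X + \sigma_1 N|U_2) - h(X + \sigma_Z N|U_2) - \frac{1}{2}\log\frac{\sigma_1^2}{\sigma_Z^2} \tag{74}\\
&= -\frac{1}{2}\int_{\sigma_1^2}^{\sigma_Z^2} J(X + \sqrt{t}\,N|U_2)\,dt - \frac{1}{2}\log\frac{\sigma_1^2}{\sigma_Z^2} \tag{75}\\
&= -\frac{1}{2}\int_{\sigma_1^2}^{\sigma_2^2} J(X + \sqrt{t}\,N|U_2)\,dt - \frac{1}{2}\int_{\sigma_2^2}^{\sigma_Z^2} J(X + \sqrt{t}\,N|U_2)\,dt - \frac{1}{2}\log\frac{\sigma_1^2}{\sigma_Z^2} \tag{76}\\
&= -\frac{1}{2}\int_{\sigma_1^2}^{\sigma_2^2} J(X + \sqrt{t}\,N|U_2)\,dt - \frac{1}{2}\log\frac{\alpha P + \sigma_Z^2}{\alpha P + \sigma_2^2} - \frac{1}{2}\log\frac{\sigma_1^2}{\sigma_Z^2} \tag{77}\\
&\leq -\frac{1}{2}\int_{\sigma_1^2}^{\sigma_2^2} \frac{1}{\alpha P + t}\,dt - \frac{1}{2}\log\frac{\alpha P + \sigma_Z^2}{\alpha P + \sigma_2^2} - \frac{1}{2}\log\frac{\sigma_1^2}{\sigma_Z^2} \tag{78}\\
&= -\frac{1}{2}\log\frac{\alpha P + \sigma_2^2}{\alpha P + \sigma_1^2} - \frac{1}{2}\log\frac{\alpha P + \sigma_Z^2}{\alpha P + \sigma_2^2} - \frac{1}{2}\log\frac{\sigma_1^2}{\sigma_Z^2} \tag{79}\\
&= \frac{1}{2}\log\left(1 + \frac{\alpha P}{\sigma_1^2}\right) - \frac{1}{2}\log\left(1 + \frac{\alpha P}{\sigma_Z^2}\right) \tag{80}
\end{align}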
where (77) follows from (65) and (72), and (78) is due to (73). Since (80) provides the desired
bound on the secrecy rate of the first user given in (23), this completes the converse proof.
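As a sanity check on the chain (75)-(80), the following sketch evaluates the Fisher-information integral numerically in the jointly Gaussian case, where the bound (73) holds with equality, i.e., $J(X + \sqrt{t}N|U_2) = 1/(\alpha P + t)$. The function names and the numerical values are illustrative assumptions and not part of the paper; rates are in nats.

```python
import numpy as np

def trapezoid(f, a, b, n=20001):
    """Composite trapezoidal rule for a vectorized scalar function on [a, b]."""
    t = np.linspace(a, b, n)
    y = f(t)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def rate_via_fisher(alpha, P, s1, sz):
    """Right-hand side of (75) with J(X + sqrt(t) N | U2) = 1 / (alpha P + t).

    s1 and sz stand for the noise variances sigma_1^2 and sigma_Z^2 (assumed s1 <= sz).
    """
    fisher = lambda t: 1.0 / (alpha * P + t)
    return -0.5 * trapezoid(fisher, s1, sz) - 0.5 * np.log(s1 / sz)

def rate_closed_form(alpha, P, s1, sz):
    """Right-hand side of (80)."""
    return 0.5 * np.log(1.0 + alpha * P / s1) - 0.5 * np.log(1.0 + alpha * P / sz)
```

With the Gaussian Fisher information plugged in, the inequality in (78) becomes an equality, so the two functions should agree up to the integration error.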

4.4 Summary of the SISO Case, Outlook for the MIMO Case 

In this section, we first revisited the standard converse proofs [22,23] of the Gaussian scalar 
broadcast channel, and showed that a straightforward extension of these proofs will not 
be able to provide a converse proof for the Gaussian SISO multi-receiver wiretap channel. 
Basically, a stand-alone use of the entropy-power inequality [24, 25] falls short of resolving 
the ambiguity on the auxiliary random variables. We showed that, in this secrecy context, 
either the connection between the mutual information and the MMSE or the connection 
between the differential entropy and the Fisher information can be used, along with their 
properties, to come up with a converse. 

In the next section, we will generalize this converse proof technique to the degraded 
MIMO channel. One way of generalizing this converse technique to the MIMO case might 
be to use the channel enhancement technique, which was successfully used in extending 
Bergmans' converse proof from the scalar Gaussian broadcast channel to the degraded vector 
Gaussian broadcast channel. We note that such an extension will not work in this secrecy 
context. In the degraded Gaussian MIMO broadcast channel, the non-trivial part of the 
converse proof was to extend Bergmans' converse to a vector case, and this was accomplished 
by the invention of the channel enhancement technique. However, as we have shown in
Section 4.1, even in the Gaussian SISO multi-receiver wiretap channel, a Bergmans type
converse does not work. Therefore, we will not pursue a channel enhancement approach to
extend our proof from the SISO channel to the degraded MIMO channel. Instead, we will
use the connections between the Fisher information and the differential entropy, as we did
in Section 4.3, to come up with a converse proof for the degraded MIMO channel. We will
then use the channel enhancement technique to extend our converse proof to the aligned 
MIMO channel. Finally, we will use some limiting arguments, as in [29,30], to come up with 
a converse proof for the most general MIMO channel. 

5 Degraded Gaussian MIMO Multi-receiver Wiretap 
Channel 

In this section, we establish the secrecy capacity region of the degraded Gaussian MIMO 
multi-receiver wiretap channel. We state the main result of this section in the following 
theorem. 

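Theorem 3 The secrecy capacity region of the degraded Gaussian MIMO multi-receiver
wiretap channel is given by the union of the rate tuples $R_1, \ldots, R_K$ satisfying

\[ R_k \leq \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i + \boldsymbol{\Sigma}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i + \boldsymbol{\Sigma}_k\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i + \boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i + \boldsymbol{\Sigma}_Z\right|}, \qquad k = 1, \ldots, K \tag{81} \]

where the union is over all positive semi-definite matrices $\{\mathbf{K}_i\}_{i=1}^{K}$ that satisfy

\[ \sum_{i=1}^{K}\mathbf{K}_i = \mathbf{S} \tag{82} \]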



The achievability of these rates follows from Theorem 1 by selecting $(U_K, \ldots, U_2, \mathbf{X})$ to be
jointly Gaussian. Thus, to prove the theorem, we only need to provide a converse. Since the
converse proof is rather long and involves technical digressions, we first present the converse
proof for K = 2. In this process, we will develop all necessary tools which we will use to
provide the converse proof for arbitrary K in Section 5.5.

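The secrecy capacity region of the two-user degraded MIMO channel, from (81), is the
union of the rate pairs $(R_1, R_2)$ satisfying

\begin{align}
R_1 &\leq \frac{1}{2}\log\frac{|\mathbf{K}_1 + \boldsymbol{\Sigma}_1|}{|\boldsymbol{\Sigma}_1|} - \frac{1}{2}\log\frac{|\mathbf{K}_1 + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{83}\\
R_2 &\leq \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_2|}{|\mathbf{K}_1 + \boldsymbol{\Sigma}_2|} - \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}_1 + \boldsymbol{\Sigma}_Z|} \tag{84}
\end{align}

where the union is over all selections of $\mathbf{K}_1$ that satisfy $\mathbf{0} \preceq \mathbf{K}_1 \preceq \mathbf{S}$. We note that these
rates are achievable by choosing $\mathbf{X} = \mathbf{U}_2 + \mathbf{V}$ in Theorem 1, where $\mathbf{U}_2$ and $\mathbf{V}$ are independent
Gaussian random vectors with covariance matrices $\mathbf{S} - \mathbf{K}_1$ and $\mathbf{K}_1$, respectively. Next, we
prove that the union of the rate pairs in (83) and (84) constitutes the secrecy capacity region
of the two-user degraded MIMO channel.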
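To make the two-user region concrete, the following minimal sketch evaluates the right-hand sides of (83) and (84) for a given power split $\mathbf{K}_1$. The function names are illustrative assumptions, and rates are returned in nats.

```python
import numpy as np

def logdet(M):
    """Log-determinant of a positive definite matrix."""
    sign, val = np.linalg.slogdet(M)
    if sign <= 0:
        raise ValueError("matrix must be positive definite")
    return val

def two_user_rates(K1, S, Sig1, Sig2, SigZ):
    """Return (R1, R2) from the right-hand sides of (83) and (84).

    K1 is the covariance assigned to the first user (0 <= K1 <= S in the
    positive semi-definite order); Sig1 <= Sig2 <= SigZ are noise covariances.
    """
    R1 = 0.5 * (logdet(K1 + Sig1) - logdet(Sig1)) \
        - 0.5 * (logdet(K1 + SigZ) - logdet(SigZ))
    R2 = 0.5 * (logdet(S + Sig2) - logdet(K1 + Sig2)) \
        - 0.5 * (logdet(S + SigZ) - logdet(K1 + SigZ))
    return R1, R2
```

Sweeping $\mathbf{K}_1$ over matrices with $\mathbf{0} \preceq \mathbf{K}_1 \preceq \mathbf{S}$ and taking the union of the resulting rectangles traces out the region; choosing $\mathbf{K}_1 = \mathbf{S}$ gives $R_2 = 0$.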



5.1 Proof of Theorem 3 for K = 2



To prove that (83) and (84) give the secrecy capacity region, we need the results of some
intermediate optimization problems. The first one is the so-called worst additive noise
lemma [37,38].

Lemma 4 Let $\mathbf{N}$ be a Gaussian random vector with covariance matrix $\boldsymbol{\Sigma}$, and $\mathbf{K}_X$ be a
positive semi-definite matrix. Consider the following optimization problem,

\[ \min_{p(\mathbf{x})} \; I(\mathbf{N}; \mathbf{N} + \mathbf{X}) \quad \text{s.t.} \quad \mathrm{Cov}(\mathbf{X}) = \mathbf{K}_X \tag{85} \]

where $\mathbf{X}$ and $\mathbf{N}$ are independent. A Gaussian $\mathbf{X}$ is the minimizer of this optimization
problem.

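The second optimization problem that will be useful in the upcoming proof is the conditional
version of the following theorem.

Theorem 4 Let $\mathbf{X}, \mathbf{N}_1, \mathbf{N}_2, \mathbf{N}_Z$ be independent random vectors, where $\mathbf{N}_1, \mathbf{N}_2, \mathbf{N}_Z$ are zero-mean
Gaussian random vectors with covariance matrices $\mathbf{0} \prec \boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2 \preceq \boldsymbol{\Sigma}_Z$, respectively.
Moreover, assume that the second moment of $\mathbf{X}$ is constrained as

\[ E\left[\mathbf{X}\mathbf{X}^\top\right] \preceq \mathbf{S} \tag{86} \]

where $\mathbf{S}$ is a positive definite matrix. Then, for any admissible $\mathbf{X}$, there exists a matrix $\mathbf{K}^*$
such that $\mathbf{0} \preceq \mathbf{K}^* \preceq \mathbf{S}$, and

\begin{align}
h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) &= \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{87}\\
h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_1) &\geq \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{88}
\end{align}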

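The conditional version of Theorem 4 is given as follows.

Theorem 5 Let $\mathbf{U}, \mathbf{X}$ be arbitrarily correlated random vectors which are independent of
$\mathbf{N}_1, \mathbf{N}_2, \mathbf{N}_Z$, where $\mathbf{N}_1, \mathbf{N}_2, \mathbf{N}_Z$ are zero-mean Gaussian random vectors with covariance
matrices $\mathbf{0} \prec \boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2 \preceq \boldsymbol{\Sigma}_Z$, respectively. Moreover, assume that the second moment of $\mathbf{X}$ is
constrained as

\[ E\left[\mathbf{X}\mathbf{X}^\top\right] \preceq \mathbf{S} \tag{89} \]

where $\mathbf{S}$ is a positive definite matrix. Then, for any admissible $(\mathbf{U}, \mathbf{X})$ pair, there exists a
matrix $\mathbf{K}^*$ such that $\mathbf{0} \preceq \mathbf{K}^* \preceq \mathbf{S}$, and

\begin{align}
h(\mathbf{X} + \mathbf{N}_Z|\mathbf{U}) - h(\mathbf{X} + \mathbf{N}_2|\mathbf{U}) &= \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{90}\\
h(\mathbf{X} + \mathbf{N}_Z|\mathbf{U}) - h(\mathbf{X} + \mathbf{N}_1|\mathbf{U}) &\geq \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{91}
\end{align}

Theorem 4 serves as a step towards the proof of Theorem 5. Proofs of these two theorems
are deferred to Sections 5.3 and 5.4.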

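We are now ready to show that the secrecy capacity region of the two-user degraded
MIMO channel is given by (83)-(84). We first consider $R_2$, and bound it using Theorem 1
as follows

\begin{align}
R_2 &\leq I(U_2; \mathbf{Y}_2) - I(U_2; \mathbf{Z}) \tag{92}\\
&= \left[I(\mathbf{X}; \mathbf{Y}_2) - I(\mathbf{X}; \mathbf{Z})\right] - \left[I(\mathbf{X}; \mathbf{Y}_2|U_2) - I(\mathbf{X}; \mathbf{Z}|U_2)\right] \tag{93}
\end{align}

where the equality is obtained by using the chain rule and the Markov chain $U_2 \to \mathbf{X} \to (\mathbf{Y}_2, \mathbf{Z})$.
We now consider the expression in the first bracket of (93)

\begin{align}
I(\mathbf{X}; \mathbf{Y}_2) - I(\mathbf{X}; \mathbf{Z}) &= h(\mathbf{Y}_2) - h(\mathbf{Y}_2|\mathbf{X}) - h(\mathbf{Z}) + h(\mathbf{Z}|\mathbf{X}) \tag{94}\\
&= h(\mathbf{Y}_2) - h(\mathbf{Z}) - \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_2|}{|\boldsymbol{\Sigma}_Z|} \tag{95}
\end{align}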



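where the second equality follows from the facts that $h(\mathbf{Y}_2|\mathbf{X}) = h(\mathbf{N}_2)$ and $h(\mathbf{Z}|\mathbf{X}) = h(\mathbf{N}_Z)$.
We now consider the difference of differential entropies in (95). To this end, consider
the Gaussian random vector $\tilde{\mathbf{N}}$ with covariance matrix $\boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2$, which is chosen to be
independent of $\mathbf{X}, \mathbf{N}_2$. Using the Markov chain in (12), we get

\begin{align}
h(\mathbf{Y}_2) - h(\mathbf{Z}) &= h(\mathbf{Y}_2) - h(\mathbf{Y}_2 + \tilde{\mathbf{N}}) \tag{96}\\
&= -I(\tilde{\mathbf{N}}; \mathbf{Y}_2 + \tilde{\mathbf{N}}) \tag{97}\\
&\leq \max_{\mathbf{0} \preceq \mathbf{K} \preceq \mathbf{S}} \frac{1}{2}\log\frac{|\mathbf{K} + \boldsymbol{\Sigma}_2|}{|\mathbf{K} + \boldsymbol{\Sigma}_Z|} \tag{98}\\
&= \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_2|}{|\mathbf{S} + \boldsymbol{\Sigma}_Z|} \tag{99}
\end{align}

where (98) follows from Lemma 4 and (99) is a consequence of the fact that

\[ \frac{|\mathbf{B}|}{|\mathbf{A} + \mathbf{B}|} \leq \frac{|\mathbf{B} + \boldsymbol{\Delta}|}{|\mathbf{A} + \mathbf{B} + \boldsymbol{\Delta}|} \tag{100} \]

when $\mathbf{A}, \mathbf{B}, \boldsymbol{\Delta} \succeq \mathbf{0}$, and $\mathbf{A} + \mathbf{B} \succ \mathbf{0}$ [29]. Plugging (99) into (95) yields

\[ I(\mathbf{X}; \mathbf{Y}_2) - I(\mathbf{X}; \mathbf{Z}) \leq \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_2|}{|\boldsymbol{\Sigma}_2|} - \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{101} \]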
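The determinant inequality in (100) can be spot-checked numerically on random positive semi-definite matrices. The sketch below is only an illustration under assumed helper names (`random_psd`, `det_ratio_gap`), not a proof.

```python
import numpy as np

def random_psd(rng, n, rank=None):
    """Random n x n positive semi-definite matrix of the given rank (full rank by default)."""
    rank = n if rank is None else rank
    G = rng.standard_normal((n, rank))
    return G @ G.T

def det_ratio_gap(A, B, Delta):
    """Return |B + Delta| / |A + B + Delta| - |B| / |A + B|.

    By (100), this quantity is non-negative when A, B, Delta are positive
    semi-definite and A + B is positive definite.
    """
    lhs = np.linalg.det(B) / np.linalg.det(A + B)
    rhs = np.linalg.det(B + Delta) / np.linalg.det(A + B + Delta)
    return rhs - lhs
```

In the step from (98) to (99), the inequality is applied with $\mathbf{B} = \mathbf{K} + \boldsymbol{\Sigma}_2$, $\mathbf{A} = \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2$ and $\boldsymbol{\Delta} = \mathbf{S} - \mathbf{K}$, which shows that the maximum is attained at $\mathbf{K} = \mathbf{S}$.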

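We now consider the expression in the second bracket of (93). For that purpose, we use
Theorem 5. According to Theorem 5, for any admissible pair $(U_2, \mathbf{X})$, there exists a $\mathbf{K}^*$ such
that

\[ h(\mathbf{X} + \mathbf{N}_Z|U_2) - h(\mathbf{X} + \mathbf{N}_2|U_2) = \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{102} \]

which is equivalent to

\[ I(\mathbf{X}; \mathbf{Z}|U_2) - I(\mathbf{X}; \mathbf{Y}_2|U_2) = \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} - \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|}{|\boldsymbol{\Sigma}_2|} \tag{103} \]

Thus, using (101) and (103) in (93), we get

\[ R_2 \leq \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_2|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} - \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|} \tag{104} \]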

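which is the desired bound on $R_2$ given in (84). We now obtain the desired bound on $R_1$
given in (83). To this end, we first bound $R_1$ using Theorem 1

\begin{align}
R_1 &\leq I(\mathbf{X}; \mathbf{Y}_1|U_2) - I(\mathbf{X}; \mathbf{Z}|U_2) \tag{105}\\
&= h(\mathbf{Y}_1|U_2) - h(\mathbf{Y}_1|U_2, \mathbf{X}) - h(\mathbf{Z}|U_2) + h(\mathbf{Z}|U_2, \mathbf{X}) \tag{106}\\
&= h(\mathbf{Y}_1|U_2) - h(\mathbf{Z}|U_2) - \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_1|}{|\boldsymbol{\Sigma}_Z|} \tag{107}
\end{align}

where the second equality follows from the facts that $h(\mathbf{Y}_1|U_2, \mathbf{X}) = h(\mathbf{N}_1)$ and $h(\mathbf{Z}|U_2, \mathbf{X}) = h(\mathbf{N}_Z)$.
To bound the difference of conditional differential entropies in (107), we use Theorem 5.
Theorem 5 states that for any admissible pair $(U_2, \mathbf{X})$, there exists a matrix $\mathbf{K}^*$ such
that it satisfies (102) and also

\[ h(\mathbf{Z}|U_2) - h(\mathbf{Y}_1|U_2) \geq \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{108} \]

Thus, using (108) in (107), we get

\[ R_1 \leq \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|}{|\boldsymbol{\Sigma}_1|} - \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{109} \]

which is the desired bound on $R_1$ given in (83), completing the converse proof for K = 2.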

As we have seen, the main ingredient in the above proof was Theorem 5. Therefore, to
complete the converse proof for the degraded channel for K = 2, from this point on, we will
focus on the proof of Theorem 5. We will give the proof of Theorem 5 in Section 5.4. In
preparation for that, we will give the proof of Theorem 4, which is the unconditional version
of Theorem 5, in Section 5.3. The proof of Theorem 4 involves the use of properties of the
Fisher information, and its connection to the differential entropy, which are provided next.

5.2 The Fisher Information Matrix 

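We start with the definition [36].

Definition 2 Let $\mathbf{U}$ be a length-n random vector with differentiable density $f_U(\mathbf{u})$. The
Fisher information matrix of $\mathbf{U}$, $\mathbf{J}(\mathbf{U})$, is defined as

\[ \mathbf{J}(\mathbf{U}) = E\left[\boldsymbol{\rho}(\mathbf{U})\boldsymbol{\rho}(\mathbf{U})^\top\right] \tag{110} \]

where $\boldsymbol{\rho}(\mathbf{u})$ is the score function which is given by

\[ \boldsymbol{\rho}(\mathbf{u}) = \nabla \log f_U(\mathbf{u}) = \left[\frac{\partial \log f_U(\mathbf{u})}{\partial u_1} \;\; \ldots \;\; \frac{\partial \log f_U(\mathbf{u})}{\partial u_n}\right]^\top \tag{111} \]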

Since we are mainly interested in the additive Gaussian channel, how the Fisher information
matrix behaves under the addition of two independent random vectors is crucial. Regarding
this, we have the following lemma which is due to [39].

Lemma 5 ([39]) Let $\mathbf{U}$ be a random vector with differentiable density, and let $\boldsymbol{\Sigma}_U \succ \mathbf{0}$ be
its covariance matrix. Moreover, let $\mathbf{V}$ be another random vector with differentiable density,
and be independent of $\mathbf{U}$. Then, we have the following facts:

1. Matrix form of the Cramér-Rao inequality

\[ \mathbf{J}(\mathbf{U}) \succeq \boldsymbol{\Sigma}_U^{-1} \tag{112} \]

which is satisfied with equality if $\mathbf{U}$ is Gaussian.

2. For any square matrix $\mathbf{A}$,

\[ \mathbf{J}(\mathbf{U} + \mathbf{V}) \preceq \mathbf{A}\mathbf{J}(\mathbf{U})\mathbf{A}^\top + (\mathbf{I} - \mathbf{A})\mathbf{J}(\mathbf{V})(\mathbf{I} - \mathbf{A})^\top \tag{113} \]

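We will use the following consequences of this lemma.

Corollary 2 Let $\mathbf{U}, \mathbf{V}$ be as specified before. Then,

1. $\mathbf{J}(\mathbf{U} + \mathbf{V}) \preceq \mathbf{J}(\mathbf{U})$

2. $\mathbf{J}(\mathbf{U} + \mathbf{V}) \preceq \left[\mathbf{J}(\mathbf{U})^{-1} + \mathbf{J}(\mathbf{V})^{-1}\right]^{-1}$

Proof: The first part of the corollary is obtained by choosing $\mathbf{A} = \mathbf{I}$, and the second part is
obtained by choosing

\[ \mathbf{A} = \left[\mathbf{J}(\mathbf{U})^{-1} + \mathbf{J}(\mathbf{V})^{-1}\right]^{-1}\mathbf{J}(\mathbf{U})^{-1} \tag{114} \]

and also by noting that $\mathbf{J}(\cdot)$ is always a symmetric matrix. ■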
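The algebra behind the second part of Corollary 2 can be verified numerically: substituting the choice (114) into the right-hand side of (113) should collapse to $[\mathbf{J}(\mathbf{U})^{-1} + \mathbf{J}(\mathbf{V})^{-1}]^{-1}$. In the sketch below, `JU` and `JV` are arbitrary symmetric positive definite stand-ins for the two Fisher information matrices; the function names are illustrative assumptions.

```python
import numpy as np

def fii_rhs(A, JU, JV):
    """Right-hand side of (113): A J(U) A^T + (I - A) J(V) (I - A)^T."""
    I = np.eye(JU.shape[0])
    return A @ JU @ A.T + (I - A) @ JV @ (I - A).T

def harmonic_sum(JU, JV):
    """[J(U)^{-1} + J(V)^{-1}]^{-1}, the bound in the second part of Corollary 2."""
    return np.linalg.inv(np.linalg.inv(JU) + np.linalg.inv(JV))

def optimal_A(JU, JV):
    """The choice of A in (114)."""
    return harmonic_sum(JU, JV) @ np.linalg.inv(JU)
```

With $\mathbf{A}$ from (114), one has $\mathbf{I} - \mathbf{A} = [\mathbf{J}(\mathbf{U})^{-1} + \mathbf{J}(\mathbf{V})^{-1}]^{-1}\mathbf{J}(\mathbf{V})^{-1}$, which is why the two quadratic terms add up to the harmonic-sum matrix.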

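The following lemma regarding the Fisher information matrix is also useful in the proof
of Theorem 4.

Lemma 6 Let $\mathbf{U}, \mathbf{V}_1, \mathbf{V}_2$ be random vectors such that $\mathbf{U}$ and $(\mathbf{V}_1, \mathbf{V}_2)$ are independent.
Moreover, let $\mathbf{V}_1, \mathbf{V}_2$ be Gaussian random vectors with covariance matrices $\mathbf{0} \prec \boldsymbol{\Sigma}_1 \preceq \boldsymbol{\Sigma}_2$.
Then, we have

\[ \mathbf{J}(\mathbf{U} + \mathbf{V}_2)^{-1} - \boldsymbol{\Sigma}_2 \succeq \mathbf{J}(\mathbf{U} + \mathbf{V}_1)^{-1} - \boldsymbol{\Sigma}_1 \tag{115} \]

Proof: Without loss of generality, let $\mathbf{V}_2 = \mathbf{V}_1 + \tilde{\mathbf{V}}_1$ such that $\tilde{\mathbf{V}}_1$ is a Gaussian random
vector with covariance matrix $\boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_1$, and independent of $\mathbf{V}_1$. Due to the second part of
Corollary 2, we have

\begin{align}
\mathbf{J}(\mathbf{U} + \mathbf{V}_2) = \mathbf{J}(\mathbf{U} + \mathbf{V}_1 + \tilde{\mathbf{V}}_1) &\preceq \left[\mathbf{J}(\mathbf{U} + \mathbf{V}_1)^{-1} + \mathbf{J}(\tilde{\mathbf{V}}_1)^{-1}\right]^{-1} \tag{116}\\
&= \left[\mathbf{J}(\mathbf{U} + \mathbf{V}_1)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_1\right]^{-1} \tag{117}
\end{align}

which is equivalent to

\[ \mathbf{J}(\mathbf{U} + \mathbf{V}_2)^{-1} \succeq \mathbf{J}(\mathbf{U} + \mathbf{V}_1)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_1 \tag{118} \]

which proves the lemma. ■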

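Moreover, we need the relationship between the Fisher information matrix and the differential
entropy, which is due to [31].

Lemma 7 ([31]) Let $\mathbf{X}$ and $\mathbf{N}$ be independent random vectors, where $\mathbf{N}$ is zero-mean Gaussian
with covariance matrix $\boldsymbol{\Sigma}_N \succ \mathbf{0}$, and $\mathbf{X}$ has a finite second order moment. Then, we
have

\[ \nabla_{\boldsymbol{\Sigma}_N} h(\mathbf{X} + \mathbf{N}) = \frac{1}{2}\,\mathbf{J}(\mathbf{X} + \mathbf{N}) \tag{119} \]
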
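5.3 Proof of Theorem 4

To prove Theorem 4, we first consider the following expression

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) \tag{120} \]

which is bounded due to the covariance constraint on $\mathbf{X}$. In particular, we have

\[ \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{S} + \boldsymbol{\Sigma}_2|} \leq h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) \leq \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_2|} \tag{121} \]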

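To see this, define $\tilde{\mathbf{N}}$ which is Gaussian with covariance matrix $\boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2$, and is independent
of $\mathbf{N}_2$ and $\mathbf{X}$. Thus, without loss of generality, we can assume $\mathbf{Z} = \mathbf{X} + \mathbf{N}_2 + \tilde{\mathbf{N}}$. Then, the
left-hand side of (121) can be verified by noting that

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) = I(\tilde{\mathbf{N}}; \mathbf{X} + \mathbf{N}_2 + \tilde{\mathbf{N}}) \tag{122} \]

and then by using Lemma 4. The right-hand side of (121) follows from

\begin{align}
h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) &= I(\tilde{\mathbf{N}}; \mathbf{X} + \mathbf{N}_Z) \tag{123}\\
&= h(\tilde{\mathbf{N}}) - h(\tilde{\mathbf{N}}|\mathbf{X} + \mathbf{N}_Z) \tag{124}\\
&\leq h(\tilde{\mathbf{N}}) - h(\tilde{\mathbf{N}}|\mathbf{X} + \mathbf{N}_Z, \mathbf{X}) \tag{125}\\
&= h(\tilde{\mathbf{N}}) - h(\tilde{\mathbf{N}}|\mathbf{N}_Z) \tag{126}\\
&= I(\tilde{\mathbf{N}}; \mathbf{N}_Z) \tag{127}\\
&= \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_2|} \tag{128}
\end{align}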



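where (125) comes from the fact that conditioning cannot increase entropy, and (126) is
due to the fact that $\mathbf{X}$ and $(\mathbf{N}_2, \tilde{\mathbf{N}})$ are independent. Thus, we can fix the difference of the
differential entropies in (121) to an $\alpha$ in this range, i.e., we can set

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) = \alpha \tag{129} \]

where $\alpha \in \left[\frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\mathbf{S} + \boldsymbol{\Sigma}_2|}, \; \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_2|}\right]$. We now would like to understand how
the constraint in (129) affects the set of admissible random vectors. For that purpose, we
use Lemma 7, and express this difference of entropies as an integral of the Fisher information
matrix

\[ \alpha = h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) = \frac{1}{2}\int_{\boldsymbol{\Sigma}_2}^{\boldsymbol{\Sigma}_Z} \mathbf{J}(\mathbf{X} + \mathbf{N})\,d\boldsymbol{\Sigma}_N \tag{130} \]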

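Using the stability of Gaussian random vectors, we can express $\mathbf{J}(\mathbf{X} + \mathbf{N})$ as

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}) = \mathbf{J}(\mathbf{X} + \mathbf{N}_2 + \tilde{\mathbf{N}}) \tag{131} \]

where $\tilde{\mathbf{N}}$ is a zero-mean Gaussian random vector with covariance matrix $\boldsymbol{\Sigma}_N - \boldsymbol{\Sigma}_2 \succeq \mathbf{0}$, and
is independent of $\mathbf{N}_2$. Using the second part of Corollary 2 in (131), we get

\begin{align}
\mathbf{J}(\mathbf{X} + \mathbf{N}) = \mathbf{J}(\mathbf{X} + \mathbf{N}_2 + \tilde{\mathbf{N}}) &\preceq \left[\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} + \mathbf{J}(\tilde{\mathbf{N}})^{-1}\right]^{-1} \tag{132}\\
&= \left[\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} + \boldsymbol{\Sigma}_N - \boldsymbol{\Sigma}_2\right]^{-1} \tag{133}
\end{align}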



where we used the fact that $\mathbf{J}(\tilde{\mathbf{N}}) = (\boldsymbol{\Sigma}_N - \boldsymbol{\Sigma}_2)^{-1}$, which is a consequence of the first part of
Lemma 5 by noting that $\tilde{\mathbf{N}}$ is Gaussian. We now bound the integral in (130) by using (133).
For that purpose, we introduce the following lemma.

[Footnote to (130): The integration in (130), i.e., $\int \mathbf{J}(\cdot)\,d\boldsymbol{\Sigma}$, is a line integral of the vector-valued function $\mathbf{J}(\cdot)$. Moreover,
since $\mathbf{J}(\cdot)$ is the gradient of a scalar field, the integration expressed in $\int_{\boldsymbol{\Sigma}_2}^{\boldsymbol{\Sigma}_Z} \mathbf{J}(\cdot)\,d\boldsymbol{\Sigma}$ is path-free, i.e., it yields
the same value for any path from $\boldsymbol{\Sigma}_2$ to $\boldsymbol{\Sigma}_Z$. This remark applies to all upcoming integrals of $\mathbf{J}(\cdot)$.]

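Lemma 8 Let $\mathbf{K}_1, \mathbf{K}_2$ be positive semi-definite matrices satisfying $\mathbf{0} \preceq \mathbf{K}_1 \preceq \mathbf{K}_2$, and $\mathbf{f}(\mathbf{K})$
be a matrix-valued function such that $\mathbf{f}(\mathbf{K}) \succeq \mathbf{0}$ for $\mathbf{K}_1 \preceq \mathbf{K} \preceq \mathbf{K}_2$. Then, we have

\[ \int_{\mathbf{K}_1}^{\mathbf{K}_2} \mathbf{f}(\mathbf{K})\,d\mathbf{K} \geq 0 \tag{134} \]

Proof: The integral is equivalent to

\[ \int_{\mathbf{K}_1}^{\mathbf{K}_2} \mathbf{f}(\mathbf{K})\,d\mathbf{K} = \int_0^1 \mathbf{1}^\top\left[\mathbf{f}\big(\mathbf{K}_1 + t(\mathbf{K}_2 - \mathbf{K}_1)\big) \odot (\mathbf{K}_2 - \mathbf{K}_1)\right]\mathbf{1}\,dt \tag{135} \]

where $\odot$ denotes the Schur (Hadamard) product, and $\mathbf{1} = [1 \; \cdots \; 1]^\top$ with appropriate size.
Since the Schur product of two positive semi-definite matrices is positive semi-definite [40],
the integrand is non-negative, implying the non-negativity of the integral. ■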
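In light of this lemma, using (133) in (130), we get

\begin{align}
\alpha &\leq \frac{1}{2}\int_{\boldsymbol{\Sigma}_2}^{\boldsymbol{\Sigma}_Z}\left[\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} + \boldsymbol{\Sigma}_N - \boldsymbol{\Sigma}_2\right]^{-1} d\boldsymbol{\Sigma}_N \tag{136}\\
&= \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2|}{|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1}|} \tag{137}
\end{align}

where we used the well-known fact that $\nabla_{\boldsymbol{\Sigma}} \log|\boldsymbol{\Sigma}| = \boldsymbol{\Sigma}^{-1}$ for $\boldsymbol{\Sigma} \succ \mathbf{0}$. We also note that the
denominator in (137) is strictly positive because

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} \succeq \mathbf{J}(\mathbf{N}_2)^{-1} = \boldsymbol{\Sigma}_2 \succ \mathbf{0} \tag{138} \]

which implies $|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1}| > 0$.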
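The step from (136) to (137) relies on the line-integral representation (135) together with $\nabla_{\boldsymbol{\Sigma}}\log|\boldsymbol{\Sigma}| = \boldsymbol{\Sigma}^{-1}$. The sketch below evaluates the integral numerically along the straight path between the two endpoints and can be compared against the log-determinant ratio. The matrix `C` is an arbitrary positive definite stand-in for $\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1}$, and all names are illustrative assumptions.

```python
import numpy as np

def line_integral(f, K1, K2, n=4001):
    """Evaluate (135) along the straight path from K1 to K2 (trapezoidal rule)."""
    ts = np.linspace(0.0, 1.0, n)
    D = K2 - K1
    # 1^T [f(K) Hadamard D] 1 is the sum of the element-wise product of f(K) and D.
    vals = np.array([np.sum(f(K1 + t * D) * D) for t in ts])
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(ts)))

def logdet(M):
    """Log-determinant of a positive definite matrix."""
    return np.linalg.slogdet(M)[1]

def integrand_136(C, Sig2):
    """Return the integrand of (136) as a function of Sigma_N, with C in place of J(X + N_2)^{-1}."""
    return lambda SigN: np.linalg.inv(C + SigN - Sig2)
```

For example, `line_integral(integrand_136(C, Sig2), Sig2, SigZ)` should match `logdet(C + SigZ - Sig2) - logdet(C)` up to the quadrature error, which is twice the right-hand side of (137).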

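Following similar steps, we can also find a lower bound on $\alpha$. Again, using the stability
of Gaussian random vectors, we have

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}_Z) = \mathbf{J}(\mathbf{X} + \mathbf{N} + \tilde{\mathbf{N}}) \tag{139} \]

where $\mathbf{N}, \tilde{\mathbf{N}}$ are zero-mean Gaussian random vectors with covariance matrices $\boldsymbol{\Sigma}_N$, $\boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_N$,
respectively, $\boldsymbol{\Sigma}_2 \preceq \boldsymbol{\Sigma}_N \preceq \boldsymbol{\Sigma}_Z$, and they are independent. Using the second part of Corollary 2
in (139) yields

\begin{align}
\mathbf{J}(\mathbf{X} + \mathbf{N}_Z) = \mathbf{J}(\mathbf{X} + \mathbf{N} + \tilde{\mathbf{N}}) &\preceq \left[\mathbf{J}(\mathbf{X} + \mathbf{N})^{-1} + \mathbf{J}(\tilde{\mathbf{N}})^{-1}\right]^{-1} \tag{140}\\
&= \left[\mathbf{J}(\mathbf{X} + \mathbf{N})^{-1} + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_N\right]^{-1} \tag{141}
\end{align}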
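where we used the fact that $\mathbf{J}(\tilde{\mathbf{N}}) = (\boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_N)^{-1}$, which follows from the first part of
Lemma 5 due to the Gaussianity of $\tilde{\mathbf{N}}$. Then, (141) is equivalent to

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} \succeq \mathbf{J}(\mathbf{X} + \mathbf{N})^{-1} + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_N \tag{142} \]

and that implies

\[ \left[\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_N - \boldsymbol{\Sigma}_Z\right]^{-1} \preceq \mathbf{J}(\mathbf{X} + \mathbf{N}) \tag{143} \]

Use of Lemma 8 and (143) in (130) yields

\begin{align}
\alpha &\geq \frac{1}{2}\int_{\boldsymbol{\Sigma}_2}^{\boldsymbol{\Sigma}_Z}\left[\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_N - \boldsymbol{\Sigma}_Z\right]^{-1} d\boldsymbol{\Sigma}_N \tag{144}\\
&= \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1}|}{|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z|} \tag{145}
\end{align}

where we again used $\nabla_{\boldsymbol{\Sigma}} \log|\boldsymbol{\Sigma}| = \boldsymbol{\Sigma}^{-1}$ for $\boldsymbol{\Sigma} \succ \mathbf{0}$. Here also, the denominator is strictly
positive because

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z \succeq \mathbf{J}(\mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z = \boldsymbol{\Sigma}_2 \succ \mathbf{0} \tag{146} \]

which implies $|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z| > 0$. Combining the two bounds on $\alpha$ given in (137)
and (145) yields

\[ \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1}|}{|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z|} \leq \alpha \leq \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2|}{|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1}|} \tag{147} \]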

Next, we will discuss the implications of (147). First, we have a digression of technical
nature to provide the necessary information for such a discussion. We present the following
lemma from [40].

Lemma 9 ([40], Theorem 7.6.4, page 465) Let $\mathbf{A}, \mathbf{B} \in M_n$, where $M_n$ is the set of all
square matrices of size $n \times n$ over the complex numbers, be two Hermitian matrices and
suppose that there is a real linear combination of $\mathbf{A}$ and $\mathbf{B}$ that is positive definite. Then
there exists a non-singular matrix $\mathbf{C}$ such that both $\mathbf{C}^H\mathbf{A}\mathbf{C}$ and $\mathbf{C}^H\mathbf{B}\mathbf{C}$ are diagonal, where
$(\cdot)^H$ denotes the conjugate transpose.

Lemma 10 Consider the function

\[ r(t) = \frac{1}{2}\log\frac{|\mathbf{A} + \mathbf{B} + t\boldsymbol{\Delta}|}{|\mathbf{A} + t\boldsymbol{\Delta}|}, \qquad 0 \leq t \leq 1 \tag{148} \]

where $\mathbf{A}, \mathbf{B}, \boldsymbol{\Delta}$ are real, symmetric matrices, and $\mathbf{A} \succ \mathbf{0}$, $\mathbf{B} \succeq \mathbf{0}$, $\boldsymbol{\Delta} \succeq \mathbf{0}$. The function $r(t)$
is continuous and monotonically decreasing in t.
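Proof: We first define the function inside the $\log(\cdot)$ as

\[ f(t) = \frac{|\mathbf{A} + \mathbf{B} + t\boldsymbol{\Delta}|}{|\mathbf{A} + t\boldsymbol{\Delta}|}, \qquad 0 \leq t \leq 1 \tag{149} \]

We first prove the continuity of $r(t)$. To this end, consider the function

\[ g(t) = |\mathbf{E} + t\boldsymbol{\Delta}|, \qquad 0 \leq t \leq 1 \tag{150} \]

where $\mathbf{E} \succ \mathbf{0}$ is a real, symmetric matrix. By Lemma 9, there exists a non-singular matrix
$\mathbf{C}$ such that both $\mathbf{C}^\top\mathbf{E}\mathbf{C}$ and $\mathbf{C}^\top\boldsymbol{\Delta}\mathbf{C}$ are diagonal. Thus, using this fact, we get

\begin{align}
g(t) &= |\mathbf{C}^{-\top}\mathbf{C}^\top\mathbf{E}\mathbf{C}\mathbf{C}^{-1} + t\,\mathbf{C}^{-\top}\mathbf{C}^\top\boldsymbol{\Delta}\mathbf{C}\mathbf{C}^{-1}| \tag{151}\\
&= |\mathbf{C}^{-\top}|\,|\mathbf{C}^\top\mathbf{E}\mathbf{C} + t\,\mathbf{C}^\top\boldsymbol{\Delta}\mathbf{C}|\,|\mathbf{C}^{-1}| \tag{152}\\
&= \frac{1}{|\mathbf{C}|^2}\,|\mathbf{C}^\top\mathbf{E}\mathbf{C} + t\,\mathbf{C}^\top\boldsymbol{\Delta}\mathbf{C}| \tag{153}\\
&= \frac{1}{|\mathbf{C}|^2}\,|\mathbf{D}_E + t\,\mathbf{D}_\Delta| \tag{154}
\end{align}

where (152) follows from the fact that $|\mathbf{A}\mathbf{B}| = |\mathbf{A}||\mathbf{B}|$, (153) comes from the fact that
$|\mathbf{C}^{-\top}| = |\mathbf{C}^{-1}| = 1/|\mathbf{C}|$, and in (154), we defined the diagonal matrices $\mathbf{D}_E = \mathbf{C}^\top\mathbf{E}\mathbf{C}$, $\mathbf{D}_\Delta = \mathbf{C}^\top\boldsymbol{\Delta}\mathbf{C}$.
Let the diagonal elements of $\mathbf{D}_E$ and $\mathbf{D}_\Delta$ be $\{d_{E,i}\}_{i=1}^n$ and $\{d_{\Delta,i}\}_{i=1}^n$, respectively.
Then, $g(t)$ can be expressed as

\[ g(t) = \frac{1}{|\mathbf{C}|^2}\prod_{i=1}^{n}\left(d_{E,i} + t\,d_{\Delta,i}\right) \tag{155} \]

which is polynomial in t, thus $g(t)$ is continuous in t. Being the ratio of two non-zero
continuous functions, $f(t)$ is continuous as well. Then, continuity of $r(t)$ follows from the
fact that the composition of two continuous functions is also continuous.

We now show the monotonicity of $r(t)$. To this end, consider the derivative of $r(t)$

\[ \frac{dr(t)}{dt} = \frac{1}{2}\,\frac{1}{f(t)}\,\frac{df(t)}{dt} \tag{156} \]

where we have $f(t) > 0$ because of the facts that $\mathbf{A} \succ \mathbf{0}$, $\mathbf{B} \succeq \mathbf{0}$, $\boldsymbol{\Delta} \succeq \mathbf{0}$, and $0 \leq t \leq 1$.
Moreover, $f(t)$ is monotonically decreasing in t, which can be deduced from (100), implying
$df(t)/dt \leq 0$. Thus, we have $dr(t)/dt \leq 0$, completing the proof. ■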
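The monotonicity claimed in Lemma 10 is easy to probe numerically. The following sketch evaluates $r(t)$ from (148) on a grid and checks that consecutive values do not increase; the function names and the tolerance are illustrative assumptions, and a grid check is of course not a substitute for the proof.

```python
import numpy as np

def r(t, A, B, Delta):
    """r(t) from (148), in nats, computed with log-determinants."""
    num = np.linalg.slogdet(A + B + t * Delta)[1]
    den = np.linalg.slogdet(A + t * Delta)[1]
    return 0.5 * (num - den)

def is_nonincreasing(A, B, Delta, n=101, tol=1e-10):
    """Check r(t) on a uniform grid over [0, 1]; tol absorbs floating-point noise."""
    vals = np.array([r(t, A, B, Delta) for t in np.linspace(0.0, 1.0, n)])
    return bool(np.all(np.diff(vals) <= tol))
```

The inputs are expected to satisfy the hypotheses of the lemma, i.e., `A` positive definite and `B`, `Delta` positive semi-definite.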
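After this digression, we are ready to investigate the implications of (147). For that
purpose, let us select $\mathbf{A}, \mathbf{B}, \boldsymbol{\Delta}$ in $r(t)$ in Lemma 10 as follows

\begin{align}
\mathbf{A} &= \mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} \tag{157}\\
\mathbf{B} &= \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2 \tag{158}\\
\boldsymbol{\Delta} &= \mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z - \mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} \tag{159}
\end{align}

where clearly $\mathbf{A} \succ \mathbf{0}$, $\mathbf{B} \succeq \mathbf{0}$, and also $\boldsymbol{\Delta} \succeq \mathbf{0}$ due to Lemma 6. With these selections, we
have

\begin{align}
r(0) &= \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2|}{|\mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1}|} \tag{160}\\
r(1) &= \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1}|}{|\mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} + \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_Z|} \tag{161}
\end{align}

Thus, (147) can be expressed as

\[ r(1) \leq \alpha \leq r(0) \tag{162} \]

We know from Lemma 10 that $r(t)$ is continuous in t. Then, from the intermediate value
theorem, there exists a $t^*$ with $0 \leq t^* \leq 1$ such that $r(t^*) = \alpha$. Thus, we have

\begin{align}
\alpha = r(t^*) &= \frac{1}{2}\log\frac{|\mathbf{A} + t^*\boldsymbol{\Delta} + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2|}{|\mathbf{A} + t^*\boldsymbol{\Delta}|} \tag{163}\\
&= \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{164}
\end{align}

where $\mathbf{K}^* = \mathbf{A} + t^*\boldsymbol{\Delta} - \boldsymbol{\Sigma}_2$. Since $0 \leq t^* \leq 1$, $\mathbf{K}^*$ satisfies the following orderings,

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} - \boldsymbol{\Sigma}_2 \preceq \mathbf{K}^* \preceq \mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} - \boldsymbol{\Sigma}_Z \tag{165} \]

which in turn, by using Lemma 5 and Corollary 2, imply the following orderings,

\begin{align}
\mathbf{K}^* &\succeq \mathbf{J}(\mathbf{X} + \mathbf{N}_2)^{-1} - \boldsymbol{\Sigma}_2 \succeq \mathbf{J}(\mathbf{N}_2)^{-1} - \boldsymbol{\Sigma}_2 = \boldsymbol{\Sigma}_2 - \boldsymbol{\Sigma}_2 = \mathbf{0} \tag{166}\\
\mathbf{K}^* &\preceq \mathbf{J}(\mathbf{X} + \mathbf{N}_Z)^{-1} - \boldsymbol{\Sigma}_Z \preceq \mathrm{Cov}(\mathbf{X}) + \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_Z = \mathrm{Cov}(\mathbf{X}) \preceq \mathbf{S} \tag{167}
\end{align}

which can be summarized as follows,

\[ \mathbf{0} \preceq \mathbf{K}^* \preceq \mathbf{S} \tag{168} \]

In addition, using Lemma 6 in (165), we get

\[ \mathbf{K}^* \succeq \mathbf{J}(\mathbf{X} + \mathbf{N})^{-1} - \boldsymbol{\Sigma}_N \tag{169} \]

for any Gaussian random vector $\mathbf{N}$ such that its covariance matrix satisfies $\boldsymbol{\Sigma}_N \preceq \boldsymbol{\Sigma}_2$. The
inequality in (169) is equivalent to

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}) \succeq (\mathbf{K}^* + \boldsymbol{\Sigma}_N)^{-1}, \qquad \text{for } \boldsymbol{\Sigma}_N \preceq \boldsymbol{\Sigma}_2 \tag{170} \]

where $\mathbf{N}$ is a Gaussian random vector with covariance matrix $\boldsymbol{\Sigma}_N$.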
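The existence argument for $t^*$ in (162)-(164) is constructive once $r(t)$ is known to be continuous and monotonically decreasing: $t^*$ can be located by bisection. The sketch below does this for given matrices $\mathbf{A}, \mathbf{B}, \boldsymbol{\Delta}$ and a target $\alpha$, and then forms $\mathbf{K}^* = \mathbf{A} + t^*\boldsymbol{\Delta} - \boldsymbol{\Sigma}_2$. The function names, the iteration count and the tolerances are illustrative assumptions; the paper only asserts existence through the intermediate value theorem.

```python
import numpy as np

def r(t, A, B, Delta):
    """r(t) from (148), in nats."""
    num = np.linalg.slogdet(A + B + t * Delta)[1]
    den = np.linalg.slogdet(A + t * Delta)[1]
    return 0.5 * (num - den)

def solve_t_star(alpha, A, B, Delta, iters=80):
    """Bisection for r(t*) = alpha on [0, 1], using that r is non-increasing."""
    lo, hi = 0.0, 1.0
    if not (r(hi, A, B, Delta) - 1e-12 <= alpha <= r(lo, A, B, Delta) + 1e-12):
        raise ValueError("alpha must lie in [r(1), r(0)], cf. (162)")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if r(mid, A, B, Delta) > alpha:
            lo = mid   # r is still too large, so t* lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def k_star(t_star, A, Delta, Sig2):
    """K* = A + t* Delta - Sigma_2, as defined after (164)."""
    return A + t_star * Delta - Sig2
```

With $\mathbf{B} = \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_2$, the returned matrix satisfies $\frac{1}{2}\log\left(|\mathbf{K}^* + \boldsymbol{\Sigma}_Z| / |\mathbf{K}^* + \boldsymbol{\Sigma}_2|\right) = \alpha$ up to numerical precision, which is exactly (164).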
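Returning to the proof of Theorem 4, we now lower bound

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_1) \tag{171} \]

while keeping

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) = \alpha = \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{172} \]

The lower bound on (171) can be obtained as follows

\begin{align}
h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_1) &= \frac{1}{2}\int_{\boldsymbol{\Sigma}_1}^{\boldsymbol{\Sigma}_Z} \mathbf{J}(\mathbf{X} + \mathbf{N})\,d\boldsymbol{\Sigma}_N \tag{173}\\
&= \frac{1}{2}\int_{\boldsymbol{\Sigma}_1}^{\boldsymbol{\Sigma}_2} \mathbf{J}(\mathbf{X} + \mathbf{N})\,d\boldsymbol{\Sigma}_N + \frac{1}{2}\int_{\boldsymbol{\Sigma}_2}^{\boldsymbol{\Sigma}_Z} \mathbf{J}(\mathbf{X} + \mathbf{N})\,d\boldsymbol{\Sigma}_N \tag{174}\\
&= \frac{1}{2}\int_{\boldsymbol{\Sigma}_1}^{\boldsymbol{\Sigma}_2} \mathbf{J}(\mathbf{X} + \mathbf{N})\,d\boldsymbol{\Sigma}_N + \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{175}\\
&\geq \frac{1}{2}\int_{\boldsymbol{\Sigma}_1}^{\boldsymbol{\Sigma}_2} (\mathbf{K}^* + \boldsymbol{\Sigma}_N)^{-1}\,d\boldsymbol{\Sigma}_N + \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{176}\\
&= \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} + \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{177}\\
&= \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{178}
\end{align}

where (174) follows from the fact that the integral in (173) is path-independent, and (176)
is due to Lemma 8 and (170).

Thus, we have shown the following: For any admissible random vector $\mathbf{X}$, we can find a
positive semi-definite matrix $\mathbf{K}^*$ such that $\mathbf{0} \preceq \mathbf{K}^* \preceq \mathbf{S}$, and

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_2) = \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_2|} \tag{179} \]

and

\[ h(\mathbf{X} + \mathbf{N}_Z) - h(\mathbf{X} + \mathbf{N}_1) \geq \frac{1}{2}\log\frac{|\mathbf{K}^* + \boldsymbol{\Sigma}_Z|}{|\mathbf{K}^* + \boldsymbol{\Sigma}_1|} \tag{180} \]

which completes the proof of Theorem 4.
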
5.4 Proof of Theorem 5

We now adapt the proof of Theorem 4 to the setting of Theorem 5 by providing the conditional
versions of the tools we have used in the proof of Theorem 4. The main ingredients
of the proof of Theorem 4 are: the relationship between the differential entropy and the
Fisher information matrix given in Lemma 7, and the properties of the Fisher information
matrix given in Lemmas 5, 6 and Corollary 2. Thus, in this section, we basically provide
the extensions of Lemmas 5, 6, 7 and Corollary 2 to the conditional setting. From another
point of view, the material that we present in this section can be regarded as extending some
well-known results on the Fisher information matrix [36,39] to a conditional setting.
We start with the definition of the conditional Fisher information matrix.

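Definition 3 Let $(\mathbf{U}, \mathbf{X})$ be an arbitrarily correlated length-n random vector pair with well-defined
densities. The conditional Fisher information matrix of $\mathbf{X}$ given $\mathbf{U}$ is defined as

\[ \mathbf{J}(\mathbf{X}|\mathbf{U}) = E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right] \tag{181} \]

where the expectation is over the joint density $f(\mathbf{u}, \mathbf{x})$, and the conditional score function
$\boldsymbol{\rho}(\mathbf{x}|\mathbf{u})$ is

\[ \boldsymbol{\rho}(\mathbf{x}|\mathbf{u}) = \nabla \log f(\mathbf{x}|\mathbf{u}) = \left[\frac{\partial \log f(\mathbf{x}|\mathbf{u})}{\partial x_1} \;\; \ldots \;\; \frac{\partial \log f(\mathbf{x}|\mathbf{u})}{\partial x_n}\right]^\top \tag{182} \]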



The following lemma extends the Stein identity [36, 39] to a conditional setting. We provide
its proof in Appendix A.

Lemma 11 (Conditional Stein Identity) Let $\mathbf{U}, \mathbf{X}$ be as specified above. Consider a
smooth scalar-valued function of $\mathbf{x}$, $g(\mathbf{x})$, which is well-behaved at infinity in the sense that

\[ \lim_{x_i \to \pm\infty} g(\mathbf{x}) f(\mathbf{x}|\mathbf{u}) = 0, \qquad i = 1, \ldots, n \tag{183} \]

For such a $g(\mathbf{x})$, we have

\[ E\left[g(\mathbf{X})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\right] = -E\left[\nabla g(\mathbf{X})\right] \tag{184} \]

The following implications of this lemma are important for the upcoming proofs.

Corollary 3 Let $\mathbf{U}, \mathbf{X}$ be as specified above.

1. $E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\right] = \mathbf{0}$

2. $E\left[\mathbf{X}\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right] = -\mathbf{I}$

Proof: The first and the second parts of the corollary follow from the previous lemma by
selecting $g(\mathbf{x}) = 1$ and $g(\mathbf{x}) = x_i$, respectively. ■

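We also need the following variation of this corollary whose proof is given in Appendix B.

Lemma 12 Let $\mathbf{U}, \mathbf{X}$ be as specified above. Then, we have

1. $E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) \,|\, \mathbf{U}\right] = \mathbf{0}$.

2. Let $g(\mathbf{u})$ be a finite, scalar-valued function of $\mathbf{u}$. For such a $g(\mathbf{u})$, we have

\[ E\left[g(\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\right] = \mathbf{0} \tag{185} \]

3. Let $E[\mathbf{X}|\mathbf{U}]$ be finite, then we have

\[ E\left[E[\mathbf{X}|\mathbf{U}]\,\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right] = \mathbf{0} \tag{186} \]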



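We are now ready to prove the conditional version of the Cramér-Rao inequality, i.e., the
generalization of the first part of Lemma 5 to a conditional setting.

Lemma 13 (Conditional Cramér-Rao Inequality) Let $\mathbf{U}, \mathbf{X}$ be arbitrarily correlated
random vectors with well-defined densities. Let the conditional covariance matrix of $\mathbf{X}$ be
$\mathrm{Cov}(\mathbf{X}|\mathbf{U}) \succ \mathbf{0}$, then we have

\[ \mathbf{J}(\mathbf{X}|\mathbf{U}) \succeq \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} \tag{187} \]

which is satisfied with equality if $(\mathbf{U}, \mathbf{X})$ is jointly Gaussian with conditional covariance
matrix $\mathrm{Cov}(\mathbf{X}|\mathbf{U})$.

Proof: We first prove the inequality. We have

\begin{align}
\mathbf{0} &\preceq E\Big[\big(\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])\big)\big(\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])\big)^\top\Big] \tag{188}\\
&= E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right] + E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])^\top\right]\mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} \nonumber\\
&\quad + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}E\left[(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right] \nonumber\\
&\quad + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}E\left[(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])^\top\right]\mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} \tag{189}\\
&= \mathbf{J}(\mathbf{X}|\mathbf{U}) + E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])^\top\right]\mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} \nonumber\\
&\quad + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}E\left[(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right] + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} \tag{190}
\end{align}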
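where for the second equality, we used the definition of the conditional Fisher information
matrix, and the conditional covariance matrix. We note that

\begin{align}
\left(E\left[(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^\top\right]\right)^\top &= E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})(\mathbf{X} - E[\mathbf{X}|\mathbf{U}])^\top\right] \tag{191}\\
&= E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\mathbf{X}^\top\right] - E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})E[\mathbf{X}|\mathbf{U}]^\top\right] \tag{192}\\
&= E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\mathbf{X}^\top\right] \tag{193}\\
&= -\mathbf{I} \tag{194}
\end{align}

where (193) is due to the third part of Lemma 12, and (194) is a result of the second part
of Corollary 3. Using (194) in (190) gives

\[ \mathbf{0} \preceq \mathbf{J}(\mathbf{X}|\mathbf{U}) - \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} - \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} + \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1} \tag{195} \]

which concludes the proof.

For the equality case, consider the conditional Gaussian distribution

\[ f(\mathbf{x}|\mathbf{u}) = C\exp\left(-\frac{1}{2}\left(\mathbf{x} - E[\mathbf{X}|\mathbf{U} = \mathbf{u}]\right)^\top\mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}\left(\mathbf{x} - E[\mathbf{X}|\mathbf{U} = \mathbf{u}]\right)\right) \tag{196} \]

where C is the normalizing factor. The conditional score function is

\[ \boldsymbol{\rho}(\mathbf{x}|\mathbf{u}) = -\mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}\left(\mathbf{x} - E[\mathbf{X}|\mathbf{U} = \mathbf{u}]\right) \tag{197} \]

which implies $\mathbf{J}(\mathbf{X}|\mathbf{U}) = \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}$. ■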
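The equality case of Lemma 13 can be illustrated by a small Monte Carlo experiment. The sketch below assumes a linear Gaussian model $\mathbf{X} = \mathbf{M}\mathbf{U} + \mathbf{W}$ with $\mathbf{W} \sim \mathcal{N}(\mathbf{0}, \mathbf{C})$ independent of a standard Gaussian $\mathbf{U}$, so that $E[\mathbf{X}|\mathbf{U}] = \mathbf{M}\mathbf{U}$ and $\mathrm{Cov}(\mathbf{X}|\mathbf{U}) = \mathbf{C}$. The model, the matrix names `M` and `C`, and the sample size are assumptions made for illustration only.

```python
import numpy as np

def conditional_fisher_mc(M, C, n_samples=200_000, seed=0):
    """Monte Carlo estimate of J(X|U) for X = M U + W with W ~ N(0, C), U ~ N(0, I).

    Uses the conditional score of (197): rho(x|u) = -C^{-1} (x - M u).
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = rng.standard_normal((n_samples, m))
    W = rng.multivariate_normal(np.zeros(n), C, size=n_samples)
    X = U @ M.T + W
    C_inv = np.linalg.inv(C)
    scores = -(X - U @ M.T) @ C_inv   # each row is rho(x|u)^T, since C_inv is symmetric
    return scores.T @ scores / n_samples   # sample average of rho rho^T, cf. (181)
```

The returned matrix should be close to `np.linalg.inv(C)`, in line with $\mathbf{J}(\mathbf{X}|\mathbf{U}) = \mathrm{Cov}(\mathbf{X}|\mathbf{U})^{-1}$ for the jointly Gaussian case; the agreement is only up to Monte Carlo error.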

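We now present the conditional convolution identity which is crucial to extend the second
part of Lemma 5 to a conditional setting.

Lemma 14 (Conditional Convolution Identity) Let $\mathbf{X}, \mathbf{Y}, \mathbf{U}$ be length-n random vectors
and let the density for any combination of these random vectors exist. Moreover, let $\mathbf{X}$
and $\mathbf{Y}$ be conditionally independent given $\mathbf{U}$, and let $\mathbf{W}$ be defined as $\mathbf{W} = \mathbf{X} + \mathbf{Y}$. Then,
we have

\[ \boldsymbol{\rho}(\mathbf{w}|\mathbf{u}) = E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U} = \mathbf{u}) \,|\, \mathbf{W} = \mathbf{w}, \mathbf{U} = \mathbf{u}\right] = E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U} = \mathbf{u}) \,|\, \mathbf{W} = \mathbf{w}, \mathbf{U} = \mathbf{u}\right] \tag{198} \]

The proof of this lemma is given in Appendix C. We will use this lemma to prove the
conditional Fisher information matrix inequality, i.e., the generalization of the second part
of Lemma 5.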
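Lemma 15 (Conditional Fisher Information Matrix Inequality) Let $\mathbf{X}, \mathbf{Y}, \mathbf{U}$ be as
specified in the previous lemma. For any square matrix $\mathbf{A}$, we have

\[ \mathbf{J}(\mathbf{X} + \mathbf{Y}|\mathbf{U}) \preceq \mathbf{A}\mathbf{J}(\mathbf{X}|\mathbf{U})\mathbf{A}^\top + (\mathbf{I} - \mathbf{A})\mathbf{J}(\mathbf{Y}|\mathbf{U})(\mathbf{I} - \mathbf{A})^\top \tag{199} \]

The proof of this lemma is given in Appendix D. The following implications of Lemma 15
correspond to the conditional version of Corollary 2.

Corollary 4 Let $\mathbf{X}, \mathbf{Y}, \mathbf{U}$ be as specified in the previous lemma. Then, we have

1. $\mathbf{J}(\mathbf{X} + \mathbf{Y}|\mathbf{U}) \preceq \mathbf{J}(\mathbf{X}|\mathbf{U})$

2. $\mathbf{J}(\mathbf{X} + \mathbf{Y}|\mathbf{U}) \preceq \left[\mathbf{J}(\mathbf{X}|\mathbf{U})^{-1} + \mathbf{J}(\mathbf{Y}|\mathbf{U})^{-1}\right]^{-1}$

Proof: The first part of the corollary can be obtained by setting $\mathbf{A} = \mathbf{I}$ in the previous
lemma. For the second part, the selection $\mathbf{A} = \left[\mathbf{J}(\mathbf{X}|\mathbf{U})^{-1} + \mathbf{J}(\mathbf{Y}|\mathbf{U})^{-1}\right]^{-1}\mathbf{J}(\mathbf{X}|\mathbf{U})^{-1}$ yields
the desired result. ■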

Using this corollary, one can prove the conditional version of Lemma 6 as well, which is
omitted. So far, we have proved the conditional versions of the inequalities related to the
Fisher information matrix that were used in the proof of Theorem 4. To claim that the
proof of Theorem 4 can be adapted for Theorem 5, we only need the conditional version of
Lemma 7. In [31], the following result is implicitly present.

Lemma 16 Let $(\mathbf{U}, \mathbf{X})$ be an arbitrarily correlated random vector pair with finite second
order moments, and be independent of the random vector $\mathbf{N}$ which is zero-mean Gaussian
with covariance matrix $\boldsymbol{\Sigma}_N \succ \mathbf{0}$. Then, we have

\[ \nabla_{\boldsymbol{\Sigma}_N} h(\mathbf{X} + \mathbf{N}|\mathbf{U}) = \frac{1}{2}\,\mathbf{J}(\mathbf{X} + \mathbf{N}|\mathbf{U}) \tag{200} \]

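Proof: Let $F_U(\mathbf{u})$ be the cumulative distribution function of $\mathbf{U}$, and $f(\mathbf{x} + \mathbf{n}|\mathbf{U} = \mathbf{u})$ be the
conditional density of $\mathbf{X} + \mathbf{N}$, which is guaranteed to exist because $\mathbf{N}$ is Gaussian. We have

\begin{align}
\nabla_{\boldsymbol{\Sigma}_N} h(\mathbf{X} + \mathbf{N}|\mathbf{U}) &= \nabla_{\boldsymbol{\Sigma}_N}\int h(\mathbf{X} + \mathbf{N}|\mathbf{U} = \mathbf{u})\,dF_U(\mathbf{u}) \tag{201}\\
&= \int \nabla_{\boldsymbol{\Sigma}_N} h(\mathbf{X} + \mathbf{N}|\mathbf{U} = \mathbf{u})\,dF_U(\mathbf{u}) \tag{202}\\
&= \frac{1}{2}\int E\left[\nabla\log f(\mathbf{X} + \mathbf{N}|\mathbf{U} = \mathbf{u})\,\nabla\log f(\mathbf{X} + \mathbf{N}|\mathbf{U} = \mathbf{u})^\top\right]dF_U(\mathbf{u}) \tag{203}\\
&= \frac{1}{2}E\left[\nabla\log f(\mathbf{X} + \mathbf{N}|\mathbf{U})\,\nabla\log f(\mathbf{X} + \mathbf{N}|\mathbf{U})^\top\right] \tag{204}\\
&= \frac{1}{2}\,\mathbf{J}(\mathbf{X} + \mathbf{N}|\mathbf{U}) \tag{205}
\end{align}

where in (202), we changed the order of integration and differentiation, which can be done
due to the finiteness of the conditional differential entropy, which in turn is ensured by the
finite second-order moments of $(\mathbf{U}, \mathbf{X})$, (203) is a consequence of Lemma 7, and (205) follows
from the definition of the conditional Fisher information matrix. ■

Since we have derived all necessary tools, namely conditional counterparts of Lemmas 5, 6, 7
and Corollary 2, the proof of Theorem 4 can be adapted to prove Theorem 5.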



5.5 Proof of Theorem 3 for Arbitrary K

We now prove Theorem 3 for arbitrary K. To this end, we will mainly use the intuition
gained in the proof of Theorem 4 and the tools developed in the previous section. The only
new ingredient that is needed is the following lemma whose proof is given in Appendix E.

Lemma 17 Let $(\mathbf{V}, \mathbf{U}, \mathbf{X})$ be length-n random vectors with well-defined densities. Moreover,
assume that the partial derivatives of $f(\mathbf{u}|\mathbf{x}, \mathbf{v})$ with respect to $x_i$, $i = 1, \ldots, n$, exist and
satisfy

\[ \max_{1 \leq i \leq n}\left|\frac{\partial f(\mathbf{u}|\mathbf{x}, \mathbf{v})}{\partial x_i}\right| \leq g(\mathbf{u}) \tag{206} \]

for some integrable function $g(\mathbf{u})$. Then, if $(\mathbf{V}, \mathbf{U}, \mathbf{X})$ satisfy the Markov chain $\mathbf{V} \to \mathbf{U} \to \mathbf{X}$,
we have

\[ \mathbf{J}(\mathbf{X}|\mathbf{U}) \succeq \mathbf{J}(\mathbf{X}|\mathbf{V}) \tag{207} \]

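We now start the proof of Theorem 3 for arbitrary K. First, we rewrite the bound given
in Theorem 1 for the Kth user's secrecy rate as follows

\begin{align}
I(U_K; \mathbf{Y}_K) - I(U_K; \mathbf{Z}) &= I(\mathbf{X}; \mathbf{Y}_K) - I(\mathbf{X}; \mathbf{Z}) - \left[I(\mathbf{X}; \mathbf{Y}_K|U_K) - I(\mathbf{X}; \mathbf{Z}|U_K)\right] \tag{208}\\
&\leq \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_K|}{|\boldsymbol{\Sigma}_K|} - \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} - \left[I(\mathbf{X}; \mathbf{Y}_K|U_K) - I(\mathbf{X}; \mathbf{Z}|U_K)\right] \tag{209}
\end{align}

where in (208), we used the Markov chain $U_K \to \mathbf{X} \to (\mathbf{Y}_K, \mathbf{Z})$, and obtained (209) using
the worst additive noise lemma given in Lemma 4. Moreover, using the Markov chain
$U_K \to \mathbf{X} \to \mathbf{Y}_K \to \mathbf{Z}$, the other difference term in (209) can be bounded as follows.

\begin{align}
0 \leq I(\mathbf{X}; \mathbf{Y}_K|U_K) - I(\mathbf{X}; \mathbf{Z}|U_K) &\leq I(\mathbf{X}; \mathbf{Y}_K) - I(\mathbf{X}; \mathbf{Z}) \tag{210}\\
&\leq \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_K|}{|\boldsymbol{\Sigma}_K|} - \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{211}
\end{align}
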
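The proofs of Theorems 4 and 5 reveal that for any value of $I(\mathbf{X}; \mathbf{Y}_K|U_K) - I(\mathbf{X}; \mathbf{Z}|U_K)$ in
the range given in (211), there exists a positive semi-definite matrix $\tilde{\mathbf{K}}_K$ such that

\[ \mathbf{J}(\mathbf{X} + \mathbf{N}_K|U_K)^{-1} - \boldsymbol{\Sigma}_K \preceq \tilde{\mathbf{K}}_K \preceq \mathbf{S} \tag{212} \]

and

\begin{align}
I(\mathbf{X}; \mathbf{Y}_K|U_K) - I(\mathbf{X}; \mathbf{Z}|U_K) &= \frac{1}{2}\log\frac{|\tilde{\mathbf{K}}_K + \boldsymbol{\Sigma}_K|}{|\boldsymbol{\Sigma}_K|} - \frac{1}{2}\log\frac{|\tilde{\mathbf{K}}_K + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{213}\\
I(\mathbf{X}; \mathbf{Y}_{K-1}|U_K) - I(\mathbf{X}; \mathbf{Z}|U_K) &\leq \frac{1}{2}\log\frac{|\tilde{\mathbf{K}}_K + \boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\tilde{\mathbf{K}}_K + \boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{214}
\end{align}

Using (213) in (209) yields the desired bound on the Kth user's secrecy rate as follows

\[ R_K \leq \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_K|}{|\tilde{\mathbf{K}}_K + \boldsymbol{\Sigma}_K|} - \frac{1}{2}\log\frac{|\mathbf{S} + \boldsymbol{\Sigma}_Z|}{|\tilde{\mathbf{K}}_K + \boldsymbol{\Sigma}_Z|} \tag{215} \]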

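We now bound the (K-1)th user's secrecy rate. To this end, first note that

$$R_{K-1} \le I(U_{K-1};\mathbf{Y}_{K-1}|U_K) - I(U_{K-1};\mathbf{Z}|U_K) \tag{216}$$

$$= I(\mathbf{X};\mathbf{Y}_{K-1}|U_K) - I(\mathbf{X};\mathbf{Z}|U_K) - \left[I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1})\right] \tag{217}$$

$$\le \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} - \left[I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1})\right] \tag{218}$$

where in order to obtain (217), we used the Markov chain $U_K \to U_{K-1} \to \mathbf{X} \to (\mathbf{Y}_{K-1},\mathbf{Z})$,
and (218) comes from (214). Using the Markov chain $U_K \to U_{K-1} \to \mathbf{X} \to \mathbf{Y}_{K-1} \to \mathbf{Z}$, the
mutual information difference in (218) is bounded as

$$0 \le I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) \le I(\mathbf{X};\mathbf{Y}_{K-1}|U_K) - I(\mathbf{X};\mathbf{Z}|U_K) \tag{219}$$

$$\le \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{220}$$

Using the analysis carried out in the proof of Theorem 4, we can get a more refined lower
bound as follows

$$I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) \ge \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1}|}{|\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1}+\boldsymbol{\Sigma}_Z-\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_Z|} \tag{221}$$

Combining (220) and (221) yields

$$\frac{1}{2}\log\frac{|\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1}|}{|\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1}+\boldsymbol{\Sigma}_Z-\boldsymbol{\Sigma}_{K-1}|} \le I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) + \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_Z|} \le \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_{K-1}|}{|\mathbf{K}_K+\boldsymbol{\Sigma}_Z|} \tag{222}$$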

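Now, using the lower bound on $\mathbf{K}_K$ given in (212), we get

$$\mathbf{K}_K \succeq \mathbf{J}(\mathbf{X}+\mathbf{N}_K|U_K)^{-1} - \boldsymbol{\Sigma}_K \tag{223}$$

$$\succeq \mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_K)^{-1} - \boldsymbol{\Sigma}_{K-1} \tag{224}$$

where (224) is obtained using the previously established monotonicity of $\mathbf{J}(\mathbf{X}+\mathbf{N}|U)^{-1} - \boldsymbol{\Sigma}_N$ in the
noise covariance $\boldsymbol{\Sigma}_N$. Moreover, since we have $U_K \to U_{K-1} \to \mathbf{X}+\mathbf{N}_{K-1}$,
the following order exists

$$\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1}) \succeq \mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_K) \tag{225}$$

due to Lemma 17. Equation (225) is equivalent to

$$\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1} \preceq \mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_K)^{-1} \tag{226}$$

using which in (224), we get

$$\mathbf{K}_K \succeq \mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1} - \boldsymbol{\Sigma}_{K-1} \tag{227}$$

We now consider the function

$$r(t) = \frac{1}{2}\log\frac{|\mathbf{A}+\mathbf{B}+t\boldsymbol{\Delta}|}{|\mathbf{A}+t\boldsymbol{\Delta}|}, \quad 0 \le t \le 1 \tag{228}$$

with the following parameters

$$\mathbf{A} = \mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1} \tag{229}$$

$$\mathbf{B} = \boldsymbol{\Sigma}_Z - \boldsymbol{\Sigma}_{K-1} \tag{230}$$

$$\boldsymbol{\Delta} = \mathbf{K}_K + \boldsymbol{\Sigma}_{K-1} - \mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1} \tag{231}$$

where $\boldsymbol{\Delta} \succeq \mathbf{0}$ due to (227). Using this function, we can paraphrase the bound in (222) as

$$-r(0) \le I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) + \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_Z|} \le -r(1) \tag{232}$$

As shown in Lemma 10, $r(t)$ is continuous and monotonically decreasing in $t$. Thus, there
exists a $t^*$ such that

$$-r(t^*) = I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) + \frac{1}{2}\log\frac{|\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_Z|} \tag{233}$$

due to the intermediate value theorem. Let $\mathbf{K}_{K-1} = \mathbf{A} + t^*\boldsymbol{\Delta} - \boldsymbol{\Sigma}_{K-1}$, then we get

$$I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) = \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{234}$$

We note that using (234) in (218) yields the desired bound on the (K-1)th user's secrecy
rate as follows

$$R_{K-1} \le \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_{K-1}|}{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\mathbf{K}_K+\boldsymbol{\Sigma}_Z|}{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_Z|} \tag{235}$$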
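The intermediate-value step in (233)-(234) can be illustrated in the scalar case, where every matrix is a positive number. The sketch below is an illustration under that assumption and is not the authors' procedure: it finds t* by bisection, which works because r(t) is decreasing, and then forms the scalar counterpart of K_{K-1}. All numerical values and function names are hypothetical.

```python
import math


def r(t, a, b, delta):
    """Scalar version of (228): r(t) = 0.5 * log((a + b + t*delta) / (a + t*delta))."""
    return 0.5 * math.log((a + b + t * delta) / (a + t * delta))


def solve_t_star(target, a, b, delta, tol=1e-13):
    """Bisection for t* in [0, 1] with -r(t*) = target; -r is increasing in t."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if -r(mid, a, b, delta) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical scalar parameters.
sigma_prev, sigma_z = 1.0, 3.0      # Sigma_{K-1} and Sigma_Z
a = 1.5                             # J(X + N_{K-1} | U_{K-1})^{-1}
k_upper = 2.0                       # K_K
b = sigma_z - sigma_prev            # (230)
delta = k_upper + sigma_prev - a    # (231), non-negative by (227)

# Pick a target inside [-r(0), -r(1)], here the value attained at t = 0.4.
target = -r(0.4, a, b, delta)
t_star = solve_t_star(target, a, b, delta)
k_lower = a + t_star * delta - sigma_prev   # scalar K_{K-1}

# Right-hand side of (234) plus 0.5*log(sigma_prev/sigma_z) must equal the target.
rhs = 0.5 * math.log((k_lower + sigma_prev) / (k_lower + sigma_z))
print(t_star, k_lower, rhs, target)
```

The recovered value of K_{K-1} lies between a - Sigma_{K-1} and K_K, which is the scalar form of the ordering used in the next step of the proof.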

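Moreover, since $\boldsymbol{\Delta} \succeq \mathbf{0}$ and $0 \le t^* \le 1$, $\mathbf{K}_{K-1} = \mathbf{A} + t^*\boldsymbol{\Delta} - \boldsymbol{\Sigma}_{K-1}$ satisfies the following
orderings

$$\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1} - \boldsymbol{\Sigma}_{K-1} \preceq \mathbf{K}_{K-1} \preceq \mathbf{K}_K \tag{236}$$

Furthermore, the lower bound in (236) implies the following order

$$\mathbf{K}_{K-1} \succeq \mathbf{J}(\mathbf{X}+\mathbf{N}|U_{K-1})^{-1} - \boldsymbol{\Sigma}_N \tag{237}$$

for any Gaussian random vector $\mathbf{N}$ that satisfies $\boldsymbol{\Sigma}_N \preceq \boldsymbol{\Sigma}_{K-1}$ and is independent of $(U_{K-1},\mathbf{X})$,
which is a consequence of the same monotonicity property used in (224). Using (237), and following the proof of Theorem 4, we
can show that

$$I(\mathbf{X};\mathbf{Y}_{K-2}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) \le \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_{K-2}|}{|\boldsymbol{\Sigma}_{K-2}|} - \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{238}$$

Thus, as a recap, we have shown that there exists $\mathbf{K}_{K-1}$ such that

$$\mathbf{J}(\mathbf{X}+\mathbf{N}_{K-1}|U_{K-1})^{-1} - \boldsymbol{\Sigma}_{K-1} \preceq \mathbf{K}_{K-1} \preceq \mathbf{K}_K \tag{239}$$

and

$$I(\mathbf{X};\mathbf{Y}_{K-1}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) = \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_{K-1}|}{|\boldsymbol{\Sigma}_{K-1}|} - \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{240}$$

$$I(\mathbf{X};\mathbf{Y}_{K-2}|U_{K-1}) - I(\mathbf{X};\mathbf{Z}|U_{K-1}) \le \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_{K-2}|}{|\boldsymbol{\Sigma}_{K-2}|} - \frac{1}{2}\log\frac{|\mathbf{K}_{K-1}+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{241}$$

which are analogous to (212), (213), (214). Thus, proceeding in the same manner, for any
selection of the joint distribution $p(u_K)p(u_{K-1}|u_K)\cdots p(\mathbf{x}|u_2)$, we can show the existence of
matrices $\{\mathbf{K}_k\}$ such that

$$\mathbf{0} = \mathbf{K}_1 \preceq \mathbf{K}_2 \preceq \ldots \preceq \mathbf{K}_K \preceq \mathbf{K}_{K+1} = \mathbf{S} \tag{242}$$

and

$$I(\mathbf{X};\mathbf{Y}_k|U_k) - I(\mathbf{X};\mathbf{Z}|U_k) = \frac{1}{2}\log\frac{|\mathbf{K}_k+\boldsymbol{\Sigma}_k|}{|\boldsymbol{\Sigma}_k|} - \frac{1}{2}\log\frac{|\mathbf{K}_k+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|}, \quad k=2,\ldots,K \tag{243}$$

$$I(\mathbf{X};\mathbf{Y}_{k-1}|U_k) - I(\mathbf{X};\mathbf{Z}|U_k) \le \frac{1}{2}\log\frac{|\mathbf{K}_k+\boldsymbol{\Sigma}_{k-1}|}{|\boldsymbol{\Sigma}_{k-1}|} - \frac{1}{2}\log\frac{|\mathbf{K}_k+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|}, \quad k=2,\ldots,K+1 \tag{244}$$

where $U_{K+1} = \phi$. We now define $\tilde{\mathbf{K}}_k = \mathbf{K}_{k+1} - \mathbf{K}_k$, $k=1,\ldots,K$, which yields $\mathbf{K}_{k+1} = \sum_{i=1}^{k}\tilde{\mathbf{K}}_i$,
and in particular, $\mathbf{S} = \sum_{i=1}^{K}\tilde{\mathbf{K}}_i$. Using these new variables in conjunction with
(243) and (244) results in

$$R_k \le I(U_k;\mathbf{Y}_k|U_{k+1}) - I(U_k;\mathbf{Z}|U_{k+1}) \tag{245}$$

$$= I(\mathbf{X};\mathbf{Y}_k|U_{k+1}) - I(\mathbf{X};\mathbf{Z}|U_{k+1}) - \left[I(\mathbf{X};\mathbf{Y}_k|U_k) - I(\mathbf{X};\mathbf{Z}|U_k)\right] \tag{246}$$

$$\le \frac{1}{2}\log\frac{|\mathbf{K}_{k+1}+\boldsymbol{\Sigma}_k|}{|\boldsymbol{\Sigma}_k|} - \frac{1}{2}\log\frac{|\mathbf{K}_{k+1}+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} - \left[I(\mathbf{X};\mathbf{Y}_k|U_k) - I(\mathbf{X};\mathbf{Z}|U_k)\right] \tag{247}$$

$$= \frac{1}{2}\log\frac{|\mathbf{K}_{k+1}+\boldsymbol{\Sigma}_k|}{|\mathbf{K}_k+\boldsymbol{\Sigma}_k|} - \frac{1}{2}\log\frac{|\mathbf{K}_{k+1}+\boldsymbol{\Sigma}_Z|}{|\mathbf{K}_k+\boldsymbol{\Sigma}_Z|} \tag{248}$$

$$= \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\tilde{\mathbf{K}}_i+\boldsymbol{\Sigma}_k\right|}{\left|\sum_{i=1}^{k-1}\tilde{\mathbf{K}}_i+\boldsymbol{\Sigma}_k\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\tilde{\mathbf{K}}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\tilde{\mathbf{K}}_i+\boldsymbol{\Sigma}_Z\right|} \tag{249}$$

for $k=2,\ldots,K$. For $k=1$, the bound in (244), by setting $k=2$ in the corresponding
expression, yields the desired bound on the first user's secrecy rate

$$R_1 \le I(\mathbf{X};\mathbf{Y}_1|U_2) - I(\mathbf{X};\mathbf{Z}|U_2) \tag{250}$$

$$\le \frac{1}{2}\log\frac{|\tilde{\mathbf{K}}_1+\boldsymbol{\Sigma}_1|}{|\boldsymbol{\Sigma}_1|} - \frac{1}{2}\log\frac{|\tilde{\mathbf{K}}_1+\boldsymbol{\Sigma}_Z|}{|\boldsymbol{\Sigma}_Z|} \tag{251}$$

Since for any selection of the joint distribution $p(u_K)p(u_{K-1}|u_K)\cdots p(\mathbf{x}|u_2)$, we can establish
the bounds in (249) and (251) with positive semi-definite matrices $\{\tilde{\mathbf{K}}_i\}_{i=1}^{K}$ such that
$\mathbf{S} = \sum_{i=1}^{K}\tilde{\mathbf{K}}_i$, the union of these bounds over such matrices is an outer bound for the
secrecy capacity region, completing the converse proof of Theorem 3 for an arbitrary K.

6 Aligned Gaussian MIMO Multi-receiver Wiretap
Channel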



We now consider the aligned Gaussian MIMO multi-receiver wiretap channel, and prove its
secrecy capacity region. To that end, we use our capacity result for the degraded
Gaussian MIMO multi-receiver wiretap channel in Section 5 in conjunction with the channel
enhancement technique [29]. Due to the presence of an eavesdropper in our channel model,
there are some differences between the way we invoke the channel enhancement technique
and the way it was used in its original version that appeared in [29]. These differences will
be pointed out during our proof.

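Given the covariance matrices $\{\mathbf{K}_i\}_{i=1}^{K}$ such that $\sum_{i=1}^{K}\mathbf{K}_i \preceq \mathbf{S}$, define the following
rates,

$$R_k^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z\right) = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}+\boldsymbol{\Sigma}_{\pi(k)}\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}+\boldsymbol{\Sigma}_{\pi(k)}\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,K \tag{252}$$

where $\pi(\cdot)$ is a one-to-one permutation on $\{1,\ldots,K\}$. We also note that the subscript
of $R_k^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z\right)$ does not denote the kth user; instead, it denotes the
(K-k+1)th user in line to be encoded. Rather, the secrecy rate of the kth user is given by

$$R_k = R_{\pi^{-1}(k)}^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z\right) \tag{253}$$

when dirty-paper coding with stochastic encoding is used with an encoding order of $\pi$. We
define the following region:

$$\mathcal{R}^{\rm DPC}\left(\pi,\mathbf{S},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z\right) = \left\{ (R_1,\ldots,R_K) : \begin{array}{l} R_k = R_{\pi^{-1}(k)}^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z\right),\ k=1,\ldots,K, \\ \text{for some } \{\mathbf{K}_i\}_{i=1}^{K} \text{ such that } \mathbf{K}_i \succeq \mathbf{0},\ i=1,\ldots,K, \text{ and } \sum_{i=1}^{K}\mathbf{K}_i \preceq \mathbf{S} \end{array} \right\} \tag{254}$$

The secrecy capacity region of the aligned Gaussian MIMO multi-receiver wiretap channel is given by
the following theorem.

Theorem 6 The secrecy capacity region of the aligned Gaussian MIMO multi-receiver wiretap
channel is given by the convex closure of the following union

$$\bigcup_{\pi\in\Pi} \mathcal{R}^{\rm DPC}\left(\pi,\mathbf{S},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z\right) \tag{255}$$

where $\Pi$ is the set of all possible one-to-one permutations on $\{1,\ldots,K\}$.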

We will show the achievability of the secrecy rates in Theorem 6 by extending Marton's
achievable scheme for broadcast channels [41] to multi-receiver wiretap channels. For that
purpose, we will use Theorem 1 of [42], where the authors provided an achievable region
for Gaussian vector broadcast channels using Marton's achievable scheme in [41]. While
using this result, we will combine it with a stochastic encoding scheme for secrecy purposes.
To provide a converse proof for Theorem 6, we will follow the channel enhancement
technique [29]. We will show that for any point on the boundary of the secrecy capacity region,
there exists a degraded channel such that its secrecy capacity region includes the secrecy
capacity region of the original channel, and furthermore, the boundaries of these two regions
intersect at this specific point.
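The dirty-paper coding rates in (252) can be evaluated numerically for a given encoding order. The following is a minimal sketch, not code from the paper, restricted to 2 x 2 matrices so that only the Python standard library is needed; rates are in nats, and the function names, dictionary layout and matrix values are illustrative assumptions.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def add2(a, b):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def dpc_secrecy_rates(order, K, Sigma, Sigma_Z):
    """Evaluate (252) for every position k of the permutation.

    order[k-1] plays the role of pi(k); K and Sigma map user index -> 2x2 matrix.
    Returns a dict user -> secrecy rate, i.e. the assignment in (253).
    """
    rates = {}
    partial = [[0.0, 0.0], [0.0, 0.0]]          # sum_{i<k} K_{pi(i)}
    for user in order:
        prev = partial
        partial = add2(partial, K[user])         # sum_{i<=k} K_{pi(i)}
        legit = 0.5 * math.log(det2(add2(partial, Sigma[user])) /
                               det2(add2(prev, Sigma[user])))
        eve = 0.5 * math.log(det2(add2(partial, Sigma_Z)) /
                             det2(add2(prev, Sigma_Z)))
        rates[user] = legit - eve
    return rates


def scaled_identity(c):
    return [[c, 0.0], [0.0, c]]


# Hypothetical two-user example with scaled-identity covariances.
K = {1: scaled_identity(1.0), 2: scaled_identity(2.0)}
Sigma = {1: scaled_identity(1.0), 2: scaled_identity(2.0)}
Sigma_Z = scaled_identity(4.0)

rates = dpc_secrecy_rates([1, 2], K, Sigma, Sigma_Z)   # identity encoding order
print(rates)
```

Calling the function with a different `order` list gives the rate tuple for another permutation, which is how the union in (255) would be swept in a numerical experiment.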



6.1 Achievability 

To show the achievability of the secrecy rates in Theorem 6, we mostly rely on the derivation
of the dirty-paper coding region for the Gaussian MIMO broadcast channel in Theorem 1
of [42]. We employ the achievable scheme in [42] in conjunction with a stochastic encoding
scheme due to secrecy concerns. Without loss of generality, we consider the identity
permutation, i.e., $\pi(k) = k$, $k=1,\ldots,K$. Let $(\mathbf{V}_1,\ldots,\mathbf{V}_K)$ be arbitrarily correlated random
vectors such that

$$(\mathbf{V}_1,\ldots,\mathbf{V}_K) \to \mathbf{X} \to (\mathbf{Y}_1,\ldots,\mathbf{Y}_K,\mathbf{Z}) \tag{256}$$

Using these correlated random vectors, we can construct codebooks $\left\{\mathbf{V}_k^n(W_k,\tilde{W}_k)\right\}_{k=1}^{K}$,
where $W_k \in \{1,\ldots,2^{nR_k}\}$, $\tilde{W}_k \in \{1,\ldots,2^{n\tilde{R}_k}\}$, $k=1,\ldots,K$, such that each legitimate
receiver can decode the following rates

$$R_k + \tilde{R}_k = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right|}, \quad k=1,\ldots,K \tag{257}$$

for some positive semi-definite matrices $\{\mathbf{K}_i\}_{i=1}^{K}$ such that $\sum_{k=1}^{K}\mathbf{K}_k \preceq \mathbf{S}$ [42]. The messages
$\{\tilde{W}_k\}_{k=1}^{K}$ do not carry any information, and their sole purpose is to confuse the eavesdropper.
In other words, the purpose of these messages is to make the eavesdropper spend its
decoding capability on them, preventing the eavesdropper from decoding the confidential
messages $\{W_k\}_{k=1}^{K}$. Thus, we need to select the rates of these dummy messages $\{\tilde{R}_k\}_{k=1}^{K}$ as
follows

$$\tilde{R}_k = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,K \tag{258}$$

To achieve the rates given in (257), $\{\mathbf{V}_k\}_{k=1}^{K}$ should be taken as jointly Gaussian with
appropriate covariance matrices. Moreover, it is sufficient to choose $\mathbf{X}$ as a deterministic
function of $\{\mathbf{V}_k\}_{k=1}^{K}$, and the resulting unconditional distribution of $\mathbf{X}$ is also Gaussian with
covariance matrix $\sum_{k=1}^{K}\mathbf{K}_k$ [42].
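The rate split in (257)-(258) can be made concrete in the scalar case. The sketch below is an illustration with hypothetical powers and noise variances, not part of the original argument: it computes the total rate, the dummy-message rate and the resulting secrecy rate of each user, and shows that the dummy rates telescope to the eavesdropper's total mutual information, 0.5 log((sum of powers + sigma_z) / sigma_z).

```python
import math


def rate_split(powers, sigmas, sigma_z):
    """Scalar versions of (257) and (258) for the identity encoding order.

    powers[k]  : power K_{k+1} allocated to user k+1
    sigmas[k]  : noise variance of user k+1
    sigma_z    : noise variance of the eavesdropper
    Returns (total_rates, dummy_rates, secrecy_rates) in nats.
    """
    total_rates, dummy_rates = [], []
    partial = 0.0
    for p, s in zip(powers, sigmas):
        prev, partial = partial, partial + p
        total_rates.append(0.5 * math.log((partial + s) / (prev + s)))
        dummy_rates.append(0.5 * math.log((partial + sigma_z) / (prev + sigma_z)))
    secrecy_rates = [t - d for t, d in zip(total_rates, dummy_rates)]
    return total_rates, dummy_rates, secrecy_rates


# Hypothetical three-user example; every legitimate noise variance is below sigma_z.
powers = [1.0, 2.0, 3.0]
sigmas = [1.0, 2.0, 3.0]
sigma_z = 5.0

total_rates, dummy_rates, secrecy_rates = rate_split(powers, sigmas, sigma_z)
dummy_sum = sum(dummy_rates)
eavesdropper_information = 0.5 * math.log((sum(powers) + sigma_z) / sigma_z)
print(secrecy_rates, dummy_sum, eavesdropper_information)
```

The equality of the last two printed numbers is the scalar form of the telescoping identity that the secrecy analysis relies on when it counts how many codewords remain once the confidential messages are fixed.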

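To complete the proof, we need to show that the above codebook structure fulfills all of
the secrecy constraints in (1). To this end, we take a shortcut, by using the fact that, if a
codebook satisfies

$$\lim_{n\to\infty}\frac{1}{n}H(W_1,\ldots,W_K|\mathbf{Z}^n) \ge \sum_{k=1}^{K}R_k \tag{259}$$

then it also satisfies all of the remaining secrecy constraints in (1) [11]. Thus, we only
check (259). To this end, we write

$$\frac{1}{n}H(W_1,\ldots,W_K|\mathbf{Z}^n) = \frac{1}{n}H(W_1,\ldots,W_K,\mathbf{Z}^n) - \frac{1}{n}H(\mathbf{Z}^n) \tag{260}$$

$$= \frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n,W_1,\ldots,W_K,\mathbf{Z}^n) - \frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n|W_1,\ldots,W_K,\mathbf{Z}^n) - \frac{1}{n}H(\mathbf{Z}^n) \tag{261}$$

$$= \frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n) + \frac{1}{n}H(W_1,\ldots,W_K,\mathbf{Z}^n|\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n) - \frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n|W_1,\ldots,W_K,\mathbf{Z}^n) - \frac{1}{n}H(\mathbf{Z}^n) \tag{262}$$

$$\ge \frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n) - \frac{1}{n}I(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n;\mathbf{Z}^n) - \frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n|W_1,\ldots,W_K,\mathbf{Z}^n) \tag{263}$$

We will treat each of the three terms in (263) separately. Since $(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n)$ can take
$2^{n\sum_{k=1}^{K}(R_k+\tilde{R}_k)}$ values uniformly, for the first term in (263), we have

$$\frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n) = \sum_{k=1}^{K}R_k + \sum_{k=1}^{K}\tilde{R}_k \tag{264}$$

The second term in (263) can be bounded as

$$\frac{1}{n}I(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n;\mathbf{Z}^n) \le I(\mathbf{V}_1,\ldots,\mathbf{V}_K;\mathbf{Z}) + \epsilon_n \tag{265}$$

$$\le I(\mathbf{X};\mathbf{Z}) + \epsilon_n \tag{266}$$

$$= \frac{1}{2}\log\frac{\left|\sum_{k=1}^{K}\mathbf{K}_k+\boldsymbol{\Sigma}_Z\right|}{|\boldsymbol{\Sigma}_Z|} + \epsilon_n \tag{267}$$

where $\epsilon_n \to 0$ as $n \to \infty$. The first inequality can be shown following Lemma 8 of [1], the
second inequality follows from the Markov chain in (256), and the equality in (267) comes
from our choice of $\mathbf{X}$, which is Gaussian with covariance matrix $\sum_{k=1}^{K}\mathbf{K}_k$. We now consider
the third term in (263). First, we note that given $(W_1=w_1,\ldots,W_K=w_K)$, $(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n)$
can take $2^{n\sum_{k=1}^{K}\tilde{R}_k}$ values, where $\sum_{k=1}^{K}\tilde{R}_k$ is given by

$$\sum_{k=1}^{K}\tilde{R}_k = \frac{1}{2}\log\frac{\left|\sum_{k=1}^{K}\mathbf{K}_k+\boldsymbol{\Sigma}_Z\right|}{|\boldsymbol{\Sigma}_Z|} \tag{268}$$

using our selection in (258). Thus, (268) implies that given $(W_1=w_1,\ldots,W_K=w_K)$, the
eavesdropper can decode $(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n)$ with vanishingly small probability of error. Hence,
using Fano's lemma, we get

$$\frac{1}{n}H(\mathbf{V}_1^n,\ldots,\mathbf{V}_K^n|W_1,\ldots,W_K,\mathbf{Z}^n) \le \frac{1}{n}\left(1 + \gamma_n\, n\sum_{k=1}^{K}\tilde{R}_k\right) \tag{269}$$

where $\gamma_n \to 0$ as $n \to \infty$. Thus, plugging (264), (267) and (269) into (263) yields

$$\lim_{n\to\infty}\frac{1}{n}H(W_1,\ldots,W_K|\mathbf{Z}^n) \ge \sum_{k=1}^{K}R_k \tag{270}$$

which ensures that the rates

$$R_k = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,K \tag{271}$$

can be transmitted in perfect secrecy.

6.2 Converse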



To show the converse, we consider the maximization of the following expression

$$\sum_{k=1}^{K}\mu_k R_k \tag{272}$$

where $\mu_k \ge 0$, $k=1,\ldots,K$. We note that the maximum value of (272) traces the boundary
of the secrecy capacity region, i.e., its maximum value for any non-negative vector $[\mu_1 \ldots \mu_K]$
will give us a point on the boundary of the secrecy capacity region. Let us define $\pi(\cdot)$ to be
a one-to-one permutation on $\{1,\ldots,K\}$ such that

$$0 \le \mu_{\pi(1)} \le \ldots \le \mu_{\pi(K)} \tag{273}$$

Furthermore, let $0 < m \le K$ of $\{\mu_k\}_{k=1}^{K}$ be strictly positive, i.e., $\mu_{\pi(1)} = \ldots = \mu_{\pi(K-m)} = 0$,
and $\mu_{\pi(K-m+1)} > 0$. We now define another permutation $\pi'(\cdot)$ on the strictly positive
elements of $\{\mu_k\}_{k=1}^{K}$ such that $\pi'(l) = \pi(K-m+l)$, $l=1,\ldots,m$. Then, (272) can be
expressed as

$$\sum_{k=1}^{K}\mu_k R_k = \sum_{k=1}^{K}\mu_{\pi(k)}R_{\pi(k)} = \sum_{k=1}^{m}\mu_{\pi'(k)}R_{\pi'(k)} \tag{274}$$

We will show that

$$\max\sum_{k=1}^{K}\mu_k R_k = \max\sum_{k=1}^{m}\mu_{\pi'(k)}R_{\pi'(k)} \tag{275}$$

$$\le \max\ \sum_{k=1}^{m}\frac{\mu_{\pi'(k)}}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_{\pi'(i)}+\boldsymbol{\Sigma}_{\pi'(k)}\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_{\pi'(i)}+\boldsymbol{\Sigma}_{\pi'(k)}\right|} - \sum_{k=1}^{m}\frac{\mu_{\pi'(k)}}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_{\pi'(i)}+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_{\pi'(i)}+\boldsymbol{\Sigma}_Z\right|} \tag{276}$$

where the last maximization is over all positive semi-definite matrices $\{\mathbf{K}_{\pi'(k)}\}_{k=1}^{m}$ such that
$\sum_{k=1}^{m}\mathbf{K}_{\pi'(k)} \preceq \mathbf{S}$. Since the right hand side of (276) is achievable, if we can show that (276)
holds for any non-negative vector $[\mu_1 \ldots \mu_K]$, this will complete the proof of Theorem 6. To
simplify the notation, without loss of generality, we assume that $\pi'(k) = k$, $k=1,\ldots,m$.
This assumption is equivalent to the assumption that $0 < \mu_1 \le \ldots \le \mu_m$, and $\mu_k = 0$,
$k=m+1,\ldots,K$.
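The bookkeeping behind the permutations in (273)-(274) can be sketched as follows. This is an illustrative helper under the assumption that the weights are given as a plain list, with users indexed from 1; the function name is hypothetical.

```python
def weight_ordering(mu):
    """Return (pi, m, pi_prime) for a non-negative weight vector mu.

    pi[l-1]       is pi(l): user indices sorted by non-decreasing weight, as in (273).
    m             is the number of strictly positive weights.
    pi_prime[l-1] is pi'(l) = pi(K - m + l): the users with positive weight.
    """
    K = len(mu)
    pi = sorted(range(1, K + 1), key=lambda k: mu[k - 1])
    m = sum(1 for w in mu if w > 0)
    pi_prime = pi[K - m:]
    return pi, m, pi_prime


# Hypothetical weights for K = 4 users; user 2 has zero weight.
mu = [0.5, 0.0, 2.0, 1.0]
pi, m, pi_prime = weight_ordering(mu)

# The weighted sum in (274) only involves the users listed in pi_prime.
print(pi, m, pi_prime)
```

Relabeling the users so that `pi_prime` becomes the identity gives the simplified assumption used in the remainder of the converse.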

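We now investigate the maximization in (276). The objective function in (276) is generally
non-convex in the covariance matrices $\{\mathbf{K}_k\}_{k=1}^{m}$, implying that the KKT conditions for
this problem are necessary, but not sufficient. Let us construct the Lagrangian for this
optimization problem

$$\mathcal{L}\left(\{\mathbf{M}_i\}_{i=1}^{m},\mathbf{M}_Z\right) = \sum_{k=1}^{m}\mu_k R_k^{G} + \sum_{k=1}^{m}{\rm tr}(\mathbf{K}_k\mathbf{M}_k) + {\rm tr}\left(\left(\mathbf{S}-\sum_{k=1}^{m}\mathbf{K}_k\right)\mathbf{M}_Z\right) \tag{277}$$

where the Lagrange multipliers $\{\mathbf{M}_i\}_{i=1}^{m}$, $\mathbf{M}_Z$ are positive semi-definite matrices, and we
defined $\{R_k^{G}\}_{k=1}^{m}$ as follows,

$$R_k^{G} = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,m \tag{278}$$

The gradient of $\mathcal{L}\left(\{\mathbf{M}_i\}_{i=1}^{m},\mathbf{M}_Z\right)$ with respect to $\mathbf{K}_j$, for any $j=1,\ldots,m-1$, is given by

$$\nabla_{\mathbf{K}_j}\mathcal{L} = \sum_{k=j}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right)^{-1} - \sum_{k=j+1}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right)^{-1} - \sum_{k=j}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \sum_{k=j+1}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_j - \mathbf{M}_Z \tag{279}$$

and the gradient of $\mathcal{L}\left(\{\mathbf{M}_i\}_{i=1}^{m},\mathbf{M}_Z\right)$ with respect to $\mathbf{K}_m$ is given by

$$\nabla_{\mathbf{K}_m}\mathcal{L} = \frac{\mu_m}{2}\left(\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_m\right)^{-1} - \frac{\mu_m}{2}\left(\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_m - \mathbf{M}_Z \tag{280}$$

The KKT conditions are given by

$$\nabla_{\mathbf{K}_j}\mathcal{L}\left(\{\mathbf{M}_i\}_{i=1}^{m},\mathbf{M}_Z\right) = \mathbf{0}, \quad j=1,\ldots,m \tag{281}$$

$${\rm tr}(\mathbf{K}_j\mathbf{M}_j) = 0, \quad j=1,\ldots,m \tag{282}$$

$${\rm tr}\left(\left(\mathbf{S}-\sum_{k=1}^{m}\mathbf{K}_k\right)\mathbf{M}_Z\right) = 0 \tag{283}$$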

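We note that since ${\rm tr}(\mathbf{K}_j\mathbf{M}_j) = {\rm tr}(\mathbf{M}_j\mathbf{K}_j) = 0$, and $\mathbf{M}_j \succeq \mathbf{0}$, $\mathbf{K}_j \succeq \mathbf{0}$, we have
$\mathbf{M}_j\mathbf{K}_j = \mathbf{K}_j\mathbf{M}_j = \mathbf{0}$. Thus, the KKT conditions in (282) are equivalent to

$$\mathbf{M}_j\mathbf{K}_j = \mathbf{K}_j\mathbf{M}_j = \mathbf{0}, \quad j=1,\ldots,m \tag{284}$$

Similarly, we also have

$$\left(\mathbf{S}-\sum_{k=1}^{m}\mathbf{K}_k\right)\mathbf{M}_Z = \mathbf{M}_Z\left(\mathbf{S}-\sum_{k=1}^{m}\mathbf{K}_k\right) = \mathbf{0} \tag{285}$$

Subtracting the gradient of the Lagrangian with respect to $\mathbf{K}_{j+1}$ from the one with respect
to $\mathbf{K}_j$, for $j=1,\ldots,m-1$, we get

$$\begin{aligned} \nabla_{\mathbf{K}_j}\mathcal{L} - \nabla_{\mathbf{K}_{j+1}}\mathcal{L} = {} & \sum_{k=j}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right)^{-1} - \sum_{k=j+1}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right)^{-1} \\ & - \sum_{k=j}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \sum_{k=j+1}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_j - \mathbf{M}_Z \\ & - \sum_{k=j+1}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right)^{-1} + \sum_{k=j+2}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_k\right)^{-1} \\ & + \sum_{k=j+1}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} - \sum_{k=j+2}^{m}\frac{\mu_k}{2}\left(\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} - \mathbf{M}_{j+1} + \mathbf{M}_Z \end{aligned} \tag{286}$$

$$= \frac{\mu_j}{2}\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_j\right)^{-1} - \frac{\mu_{j+1}}{2}\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_{j+1}\right)^{-1} + \frac{\mu_{j+1}-\mu_j}{2}\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_j - \mathbf{M}_{j+1} \tag{287}$$

Thus, using (284), (285), (287), we can express the KKT conditions in (281), (282), (283) as
follows

$$\mu_j\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_j\right)^{-1} + (\mu_{j+1}-\mu_j)\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_j = \mu_{j+1}\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_{j+1}\right)^{-1} + \mathbf{M}_{j+1}, \quad j=1,\ldots,m-1 \tag{288}$$

$$\mu_m\left(\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_m\right)^{-1} + \mathbf{M}_m = \mu_m\left(\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_Z \tag{289}$$

$$\mathbf{K}_j\mathbf{M}_j = \mathbf{M}_j\mathbf{K}_j = \mathbf{0}, \quad j=1,\ldots,m \tag{290}$$

$$\left(\mathbf{S}-\sum_{k=1}^{m}\mathbf{K}_k\right)\mathbf{M}_Z = \mathbf{M}_Z\left(\mathbf{S}-\sum_{k=1}^{m}\mathbf{K}_k\right) = \mathbf{0} \tag{291}$$

where we also embed the multiplications by 2 into the Lagrange multipliers.

We now present a lemma which will be instrumental in constructing a degraded Gaussian
MIMO multi-receiver wiretap channel, such that the secrecy capacity region of the
constructed channel includes the secrecy capacity region of the original channel, and the
boundary of the secrecy capacity region of this constructed channel coincides with the boundary of
the secrecy capacity region of the original channel at a certain point for a given non-negative
vector $[\mu_1 \ldots \mu_K]$.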
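The passage from the trace conditions (282) to the product conditions (284) relies on the fact that two positive semi-definite matrices with zero trace product have a zero product. A minimal 2 x 2 illustration, with hypothetical rank-one matrices built from orthogonal vectors, is sketched below; it is only an example of the implication, not a proof.

```python
def outer(v, w):
    """Outer product v w^T of two length-2 vectors."""
    return [[v[i] * w[j] for j in range(2)] for i in range(2)]


def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace2(a):
    return a[0][0] + a[1][1]


# K = v v^T and M = w w^T are positive semi-definite; v and w are orthogonal.
v, w = [1.0, 2.0], [2.0, -1.0]
K = outer(v, v)
M = outer(w, w)

product = matmul2(K, M)
trace_KM = trace2(product)
print(trace_KM, product)
```

In this example the multiplier M is supported on the null space of K, which is the complementary-slackness picture behind (284) and (290).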



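Lemma 18 Given the covariance matrices $\{\mathbf{K}_j\}_{j=1}^{m}$ satisfying the KKT conditions given in
(288)-(291), there exist noise covariance matrices $\{\tilde{\boldsymbol{\Sigma}}_j\}_{j=1}^{m}$ such that

1. $\tilde{\boldsymbol{\Sigma}}_j \preceq \boldsymbol{\Sigma}_j$, $j=1,\ldots,m$.

2. $\mathbf{0} \prec \tilde{\boldsymbol{\Sigma}}_1 \preceq \ldots \preceq \tilde{\boldsymbol{\Sigma}}_m \preceq \boldsymbol{\Sigma}_Z$.

3. $\mu_j\left(\sum_{i=1}^{j}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_j\right)^{-1} + (\mu_{j+1}-\mu_j)\left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} = \mu_{j+1}\left(\sum_{i=1}^{j}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_{j+1}\right)^{-1}$
for $j=1,\ldots,m-1$, and $\mu_m\left(\sum_{i=1}^{m}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_m\right)^{-1} = \mu_m\left(\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1} + \mathbf{M}_Z$.

4. $\left(\sum_{i=1}^{j}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_j\right)\left(\sum_{i=1}^{j-1}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_j\right)^{-1} = \left(\sum_{i=1}^{j}\mathbf{K}_i+\boldsymbol{\Sigma}_j\right)\left(\sum_{i=1}^{j-1}\mathbf{K}_i+\boldsymbol{\Sigma}_j\right)^{-1}$
for $j=1,\ldots,m$.

5. $\left(\mathbf{S}+\tilde{\boldsymbol{\Sigma}}_m\right)\left(\sum_{i=1}^{m}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_m\right)^{-1} = \left(\mathbf{S}+\boldsymbol{\Sigma}_Z\right)\left(\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right)^{-1}$.

The proof of this lemma is given in Appendix F.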

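Without loss of generality, we have already fixed $[\mu_1 \ldots \mu_K]$ such that $0 < \mu_1 \le \ldots \le \mu_m$,
and $\mu_k = 0$, $k=m+1,\ldots,K$ for some $0 < m \le K$. For this fixed $[\mu_1 \ldots \mu_K]$, assume
that $\{\mathbf{K}_k^*\}_{k=1}^{m}$ achieves the maximum of (276). Since these covariance matrices need to
satisfy the KKT conditions given in (288)-(291), Lemma 18 ensures the existence of the
covariance matrices $\{\tilde{\boldsymbol{\Sigma}}_k\}_{k=1}^{m}$ that have the properties listed in Lemma 18. Thus, we can
define a degraded Gaussian MIMO multi-receiver wiretap channel that has the following
noise covariance matrices

$$\hat{\boldsymbol{\Sigma}}_k = \begin{cases} \tilde{\boldsymbol{\Sigma}}_k, & 1 \le k \le m \\ \alpha_{k-m}\tilde{\boldsymbol{\Sigma}}_1, & m+1 \le k \le K \end{cases} \tag{292}$$

where $0 < \alpha_{k-m} < 1$ are chosen to satisfy $\alpha_{k-m}\tilde{\boldsymbol{\Sigma}}_1 \preceq \boldsymbol{\Sigma}_k$ for $k=m+1,\ldots,K$, where the
existence of such $\{\alpha_{k-m}\}_{k=m+1}^{K}$ is ensured by the positive definiteness of $\{\boldsymbol{\Sigma}_k\}_{k=1}^{K}$. The
noise covariance matrix of the eavesdropper is the same as in the original channel, i.e., $\boldsymbol{\Sigma}_Z$.
Since this channel is degraded, its secrecy capacity region is given by Theorem 3. Moreover,
since $\hat{\boldsymbol{\Sigma}}_k \preceq \boldsymbol{\Sigma}_k$, $k=1,\ldots,K$, and the eavesdropper's noise covariance matrices in the constructed
degraded channel and the original channel are the same, the secrecy capacity region of this degraded
channel outer bounds that of the original channel. Next, we show that for the so-far fixed
$[\mu_1 \ldots \mu_K]$, the boundaries of these two regions intersect at this point. For this purpose,
reconsider the maximization problem in (272)

$$\max\sum_{k=1}^{K}\mu_k R_k = \max\sum_{k=1}^{m}\mu_k R_k \tag{293}$$

$$\le \max_{\substack{\mathbf{K}_i \succeq \mathbf{0},\ i=1,\ldots,K \\ \sum_{i=1}^{K}\mathbf{K}_i \preceq \mathbf{S}}}\ \sum_{k=1}^{m}\frac{\mu_k}{2}\left[\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\sum_{i=m+1}^{K}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\sum_{i=m+1}^{K}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right|} - \log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\sum_{i=m+1}^{K}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\sum_{i=m+1}^{K}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}\right] \tag{294}$$

$$= \max_{\substack{\mathbf{K}_i \succeq \mathbf{0},\ i=1,\ldots,m \\ \sum_{i=1}^{m}\mathbf{K}_i \preceq \mathbf{S}}}\ \sum_{k=1}^{m}\frac{\mu_k}{2}\left[\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right|} - \log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}\right] \tag{295}$$

where (293) is implied by the fact that for the fixed $[\mu_1 \ldots \mu_K]$, we assumed that $\mu_k = 0$,
$k=m+1,\ldots,K$ and $0 < \mu_1 \le \ldots \le \mu_m$, (294) follows from the facts that the secrecy capacity
region of the constructed degraded channel includes the secrecy capacity region of the original channel, and the secrecy
capacity region of the degraded channel is given by Theorem 3. The last equation, i.e., (295),
comes from the fact that, since $\mu_k = 0$, $k=m+1,\ldots,K$, there is no loss of optimality in
choosing $\mathbf{K}_k = \mathbf{0}$, $k=m+1,\ldots,K$. We now claim that the maximum in (295) is achieved
by $\{\mathbf{K}_k^*\}_{k=1}^{m}$. To prove this claim, we first define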



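$$\tilde{R}_k^* = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,m \tag{296}$$

and

$$\tilde{R}_k = \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right|} - \frac{1}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,m \tag{297}$$

for some arbitrary positive semi-definite matrices $\{\mathbf{K}_i\}_{i=1}^{m}$ such that $\sum_{i=1}^{m}\mathbf{K}_i \preceq \mathbf{S}$. To prove
that the maximum in (295) is achieved by $\{\mathbf{K}_k^*\}_{k=1}^{m}$, we will show that

$$\sum_{k=1}^{m}\mu_k\tilde{R}_k^* - \sum_{k=1}^{m}\mu_k\tilde{R}_k \ge 0 \tag{298}$$

To this end, consider the first summation in (298)

$$\sum_{k=1}^{m}\mu_k\tilde{R}_k^* = \sum_{k=1}^{m}\frac{\mu_k}{2}\left[\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|} - \log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}\right] \tag{299}$$

$$= \sum_{k=1}^{m}\frac{\mu_k}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|} - \sum_{k=2}^{m}\frac{\mu_k}{2}\log\frac{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|} - \frac{\mu_1}{2}\log\frac{|\tilde{\boldsymbol{\Sigma}}_1|}{|\boldsymbol{\Sigma}_Z|} \tag{300}$$

$$= \sum_{k=1}^{m}\frac{\mu_k}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|} - \sum_{k=1}^{m-1}\frac{\mu_{k+1}}{2}\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_{k+1}\right|}{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|} - \frac{\mu_1}{2}\log\frac{|\tilde{\boldsymbol{\Sigma}}_1|}{|\boldsymbol{\Sigma}_Z|} \tag{301}$$

$$\begin{aligned} = {} & \frac{\mu_m}{2}\log\frac{\left|\sum_{i=1}^{m}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_m\right|}{\left|\sum_{i=1}^{m}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|} + \sum_{k=1}^{m-1}\frac{\mu_k}{2}\log\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right| - \sum_{k=1}^{m-1}\frac{\mu_{k+1}}{2}\log\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_{k+1}\right| \\ & + \sum_{k=1}^{m-1}\frac{\mu_{k+1}-\mu_k}{2}\log\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right| - \frac{\mu_1}{2}\log\frac{|\tilde{\boldsymbol{\Sigma}}_1|}{|\boldsymbol{\Sigma}_Z|} \end{aligned} \tag{302}$$

Similarly, we have

$$\begin{aligned} \sum_{k=1}^{m}\mu_k\tilde{R}_k = {} & \frac{\mu_m}{2}\log\frac{\left|\sum_{i=1}^{m}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_m\right|}{\left|\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|} + \sum_{k=1}^{m-1}\frac{\mu_k}{2}\log\left|\sum_{i=1}^{k}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_k\right| - \sum_{k=1}^{m-1}\frac{\mu_{k+1}}{2}\log\left|\sum_{i=1}^{k}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_{k+1}\right| \\ & + \sum_{k=1}^{m-1}\frac{\mu_{k+1}-\mu_k}{2}\log\left|\sum_{i=1}^{k}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right| - \frac{\mu_1}{2}\log\frac{|\tilde{\boldsymbol{\Sigma}}_1|}{|\boldsymbol{\Sigma}_Z|} \end{aligned} \tag{303}$$

We define the following matrices

$$\boldsymbol{\Delta}_k = \sum_{i=1}^{k}\mathbf{K}_i - \sum_{i=1}^{k}\mathbf{K}_i^*, \quad k=1,\ldots,m \tag{304}$$

Using (302), (303) and (304), the difference in (298) can be expressed as

$$\begin{aligned} \sum_{k=1}^{m}\mu_k\tilde{R}_k^* - \sum_{k=1}^{m}\mu_k\tilde{R}_k = {} & \frac{\mu_m}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{m}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right)^{-1}\boldsymbol{\Delta}_m\right| - \frac{\mu_m}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{m}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_m\right)^{-1}\boldsymbol{\Delta}_m\right| \\ & - \sum_{k=1}^{m-1}\frac{\mu_k}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right)^{-1}\boldsymbol{\Delta}_k\right| + \sum_{k=1}^{m-1}\frac{\mu_{k+1}}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_{k+1}\right)^{-1}\boldsymbol{\Delta}_k\right| \\ & - \sum_{k=1}^{m-1}\frac{\mu_{k+1}-\mu_k}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right)^{-1}\boldsymbol{\Delta}_k\right| \end{aligned} \tag{305}$$

We first note that

$$\frac{\mu_m}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{m}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right)^{-1}\boldsymbol{\Delta}_m\right| - \frac{\mu_m}{2}\log\left|\mathbf{I}+\left(\sum_{i=1}^{m}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_m\right)^{-1}\boldsymbol{\Delta}_m\right| = \frac{\mu_m}{2}\log\frac{|\mathbf{S}+\tilde{\boldsymbol{\Sigma}}_m|}{|\mathbf{S}+\boldsymbol{\Sigma}_Z|} - \frac{\mu_m}{2}\log\frac{\left|\sum_{i=1}^{m}\mathbf{K}_i+\tilde{\boldsymbol{\Sigma}}_m\right|}{\left|\sum_{i=1}^{m}\mathbf{K}_i+\boldsymbol{\Sigma}_Z\right|} \ge 0 \tag{306}$$

where the equality is due to the fifth part of Lemma 18 and the inequality follows from the
fact that the function

$$\frac{|\mathbf{A}+\tilde{\boldsymbol{\Sigma}}_m|}{|\mathbf{A}+\boldsymbol{\Sigma}_Z|} \tag{307}$$

is monotonically increasing in the positive semi-definite matrix $\mathbf{A}$ as can be deduced from
(100), and that $\sum_{i=1}^{m}\mathbf{K}_i \preceq \mathbf{S}$. Furthermore, for $k=1,\ldots,m-1$, we have

$$\frac{\mu_k}{\mu_{k+1}}\log\left|\mathbf{I}+\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right)^{-1}\boldsymbol{\Delta}_k\right| + \frac{\mu_{k+1}-\mu_k}{\mu_{k+1}}\log\left|\mathbf{I}+\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right)^{-1}\boldsymbol{\Delta}_k\right|$$

$$\le \log\left|\mathbf{I}+\left[\frac{\mu_k}{\mu_{k+1}}\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right)^{-1} + \frac{\mu_{k+1}-\mu_k}{\mu_{k+1}}\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right)^{-1}\right]\boldsymbol{\Delta}_k\right| \tag{308}$$

$$= \log\left|\mathbf{I}+\left(\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_{k+1}\right)^{-1}\boldsymbol{\Delta}_k\right| \tag{309}$$

where the inequality in (308) follows from the concavity of $\log|\cdot|$ in positive semi-definite
matrices, and (309) follows from the third part of Lemma 18. Using (306) and (309) in (305)
yields

$$\sum_{k=1}^{m}\mu_k\tilde{R}_k^* - \sum_{k=1}^{m}\mu_k\tilde{R}_k \ge 0 \tag{310}$$

which implies that the maximum in (295) is achieved by $\{\mathbf{K}_k^*\}_{k=1}^{m}$. Thus, using this fact in
(295), we get

$$\max\sum_{k=1}^{K}\mu_k R_k \le \sum_{k=1}^{m}\frac{\mu_k}{2}\left[\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\tilde{\boldsymbol{\Sigma}}_k\right|} - \log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}\right] \tag{311}$$

$$= \sum_{k=1}^{m}\frac{\mu_k}{2}\left[\log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_k\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\boldsymbol{\Sigma}_k\right|} - \log\frac{\left|\sum_{i=1}^{k}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}{\left|\sum_{i=1}^{k-1}\mathbf{K}_i^*+\boldsymbol{\Sigma}_Z\right|}\right] \tag{312}$$

where the equality follows from the fourth part of Lemma 18. Since the right hand side
of (312) is achievable, and we can get a similar outer bound for any non-negative vector
$[\mu_1 \ldots \mu_K]$, this completes the converse proof for the aligned Gaussian MIMO channel.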
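The concavity step in (308) reduces, in the scalar case, to Jensen's inequality for the logarithm. The following sketch checks that scalar inequality numerically; it is an illustration only, with the weight lam standing for mu_k / mu_{k+1}, and a, b, delta standing for hypothetical scalar values of the two inverse matrices and of Delta_k.

```python
import math


def jensen_gap(lam, a, b, delta):
    """Right-hand side minus left-hand side of the scalar form of (308).

    lhs = lam*log(1 + a*delta) + (1 - lam)*log(1 + b*delta)
    rhs = log(1 + (lam*a + (1 - lam)*b)*delta)
    Concavity of log implies the returned gap is non-negative.
    """
    lhs = lam * math.log(1.0 + a * delta) + (1.0 - lam) * math.log(1.0 + b * delta)
    rhs = math.log(1.0 + (lam * a + (1.0 - lam) * b) * delta)
    return rhs - lhs


# Hypothetical values; delta may be negative as long as 1 + a*delta and 1 + b*delta stay positive.
gap_positive_delta = jensen_gap(0.25, 2.0, 0.5, 1.0)
gap_negative_delta = jensen_gap(0.25, 2.0, 0.5, -0.3)
gap_equal_arguments = jensen_gap(0.25, 1.0, 1.0, 0.7)
print(gap_positive_delta, gap_negative_delta, gap_equal_arguments)
```

The gap vanishes when the two arguments coincide, which mirrors the fact that (308) holds with equality when the two inverse matrices being averaged are equal.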



7 General Gaussian MIMO Multi-receiver Wiretap 
Channel 

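In this final part of the paper, we consider the general Gaussian multi-receiver wiretap
channel and prove its secrecy capacity region. The main idea in this section is to construct
an aligned channel that is indexed by a scalar variable, and then show that this aligned
channel has the same secrecy capacity region as the original channel in the limit of this
indexing parameter on the constructed aligned channel. This argument was previously used
in [29,30]. The way we use this argument here is different from [29] because there are no
secrecy constraints in [29], and it is different from [30] because there are multiple legitimate
receivers here.

Given the covariance matrices $\{\mathbf{K}_i\}_{i=1}^{K}$ such that $\sum_{i=1}^{K}\mathbf{K}_i \preceq \mathbf{S}$, we define the following
rates

$$R_k^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right) = \frac{1}{2}\log\frac{\left|\mathbf{H}_{\pi(k)}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_{\pi(k)}^\top+\boldsymbol{\Sigma}_{\pi(k)}\right|}{\left|\mathbf{H}_{\pi(k)}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_{\pi(k)}^\top+\boldsymbol{\Sigma}_{\pi(k)}\right|} - \frac{1}{2}\log\frac{\left|\mathbf{H}_Z\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_Z^\top+\boldsymbol{\Sigma}_Z\right|}{\left|\mathbf{H}_Z\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_Z^\top+\boldsymbol{\Sigma}_Z\right|}, \quad k=1,\ldots,K \tag{313}$$

where $\pi(\cdot)$ is a one-to-one permutation on $\{1,\ldots,K\}$. We also note that the subscript
of $R_k^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right)$ does not denote the kth user; instead, it
denotes the (K-k+1)th user in line to be encoded. Rather, the secrecy rate of the kth
user is given by

$$R_k = R_{\pi^{-1}(k)}^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right) \tag{314}$$

when dirty-paper coding with stochastic encoding is used with an encoding order of $\pi$.
We define the following region.

$$\mathcal{R}^{\rm DPC}\left(\pi,\mathbf{S},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right) = \left\{ (R_1,\ldots,R_K) : \begin{array}{l} R_k = R_{\pi^{-1}(k)}^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right),\ k=1,\ldots,K, \\ \text{for some } \{\mathbf{K}_i\}_{i=1}^{K} \text{ such that } \mathbf{K}_i \succeq \mathbf{0},\ i=1,\ldots,K, \text{ and } \sum_{i=1}^{K}\mathbf{K}_i \preceq \mathbf{S} \end{array} \right\} \tag{315}$$

The secrecy capacity region of the general Gaussian MIMO multi-receiver wiretap channel is given by
the following theorem.

Theorem 7 The secrecy capacity region of the general Gaussian MIMO multi-receiver wiretap
channel is given by the convex closure of the following union

$$\bigcup_{\pi\in\Pi} \mathcal{R}^{\rm DPC}\left(\pi,\mathbf{S},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right) \tag{316}$$

where $\Pi$ is the set of all possible one-to-one permutations on $\{1,\ldots,K\}$.

7.1 Proof of Theorem 7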



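Achievability of the region given in Theorem 7 can be shown by following the achievability
proof of Theorem 6 given in Section 6.1; hence it is omitted. For the converse, we use
the ideas presented in [29,30]. Following Section V-B of [29], we can construct an equivalent
channel which has the same secrecy capacity region as the original channel defined in (13)-(14).
In this constructed equivalent channel, all receivers, including the eavesdropper, and
the transmitter have the same number of antennas, which is $t$,

$$\bar{\mathbf{Y}}_k = \bar{\mathbf{H}}_k\mathbf{X} + \bar{\mathbf{N}}_k, \quad k=1,\ldots,K \tag{317}$$

$$\bar{\mathbf{Z}} = \bar{\mathbf{H}}_Z\mathbf{X} + \bar{\mathbf{N}}_Z \tag{318}$$

where $\bar{\mathbf{H}}_k = \boldsymbol{\Lambda}_k\mathbf{V}_k$, $\mathbf{V}_k$ is a $t \times t$ orthonormal matrix, and $\boldsymbol{\Lambda}_k$ is a $t \times t$ diagonal matrix whose
first $(t-r_k)$ diagonal entries are zero, and the rest of the diagonal entries are strictly positive.
Here, $r_k$ is the rank of the original channel gain matrix, $\mathbf{H}_k$. The noise covariance matrix of
the Gaussian random vector $\bar{\mathbf{N}}_k$ is given by $\bar{\boldsymbol{\Sigma}}_k$, which has the following block diagonal form

$$\bar{\boldsymbol{\Sigma}}_k = \begin{bmatrix} \boldsymbol{\Sigma}_k^A & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_k^B \end{bmatrix} \tag{319}$$

where $\boldsymbol{\Sigma}_k^A$ is of size $(t-r_k) \times (t-r_k)$, and $\boldsymbol{\Sigma}_k^B$ is of size $r_k \times r_k$.

Similar notations hold for the eavesdropper's observation $\bar{\mathbf{Z}}$ as well. In particular, $\bar{\mathbf{H}}_Z =
\boldsymbol{\Lambda}_Z\mathbf{V}_Z$ where $\mathbf{V}_Z$ is a $t \times t$ orthonormal matrix, and $\boldsymbol{\Lambda}_Z$ is a $t \times t$ diagonal matrix whose first
$(t-r_Z)$ diagonal entries are zero, and the rest of the diagonal entries are strictly positive.
Here, $r_Z$ is the rank of the original channel gain matrix of the eavesdropper, $\mathbf{H}_Z$. The
covariance matrix of the Gaussian random vector $\bar{\mathbf{N}}_Z$ is given by $\bar{\boldsymbol{\Sigma}}_Z$, which has the following
block diagonal form

$$\bar{\boldsymbol{\Sigma}}_Z = \begin{bmatrix} \boldsymbol{\Sigma}_Z^A & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_Z^B \end{bmatrix} \tag{320}$$

where $\boldsymbol{\Sigma}_Z^A$ is of size $(t-r_Z) \times (t-r_Z)$ and $\boldsymbol{\Sigma}_Z^B$ is of size $r_Z \times r_Z$. Since this new channel
in (317)-(318) can be constructed from the original channel in (13)-(14) through invertible
transformations [29], both have the same secrecy capacity region. Moreover, these
transformations preserve the dirty-paper coding region as well, i.e.,

$$R_k^{\rm DPC}\left(\pi,\{\mathbf{K}_i\}_{i=1}^{K},\{\boldsymbol{\Sigma}_i\}_{i=1}^{K},\boldsymbol{\Sigma}_Z,\{\mathbf{H}_i\}_{i=1}^{K},\mathbf{H}_Z\right) = \frac{1}{2}\log\frac{\left|\bar{\mathbf{H}}_{\pi(k)}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_{\pi(k)}^\top+\bar{\boldsymbol{\Sigma}}_{\pi(k)}\right|}{\left|\bar{\mathbf{H}}_{\pi(k)}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_{\pi(k)}^\top+\bar{\boldsymbol{\Sigma}}_{\pi(k)}\right|} - \frac{1}{2}\log\frac{\left|\bar{\mathbf{H}}_Z\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_Z^\top+\bar{\boldsymbol{\Sigma}}_Z\right|}{\left|\bar{\mathbf{H}}_Z\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_Z^\top+\bar{\boldsymbol{\Sigma}}_Z\right|}, \quad k=1,\ldots,K \tag{321}$$

We now define another channel which does not have the same secrecy capacity region or
the dirty paper coding region as the original channel:

$$\tilde{\mathbf{Y}}_k = \tilde{\mathbf{H}}_k\mathbf{X} + \bar{\mathbf{N}}_k, \quad k=1,\ldots,K \tag{322}$$

$$\tilde{\mathbf{Z}} = \tilde{\mathbf{H}}_Z\mathbf{X} + \bar{\mathbf{N}}_Z \tag{323}$$

where $\tilde{\mathbf{H}}_k = \left(\boldsymbol{\Lambda}_k + \alpha\hat{\mathbf{I}}_k\right)\mathbf{V}_k$ and $\alpha > 0$, and $\hat{\mathbf{I}}_k$ is a $t \times t$ diagonal matrix whose first
$(t-r_k)$ diagonal entries are 1, and the rest of the diagonal entries are zero. Similarly,
$\tilde{\mathbf{H}}_Z = \left(\boldsymbol{\Lambda}_Z + \alpha\hat{\mathbf{I}}_Z\right)\mathbf{V}_Z$, where $\hat{\mathbf{I}}_Z$ is a $t \times t$ diagonal matrix whose first $(t-r_Z)$ diagonal
entries are 1, and the rest are zero. We note that $\{\tilde{\mathbf{H}}_k\}_{k=1}^{K}$, $\tilde{\mathbf{H}}_Z$ are invertible, hence the
channel defined by (322)-(323) can be considered as an aligned Gaussian MIMO multi-receiver
wiretap channel. Thus, since it is an aligned Gaussian MIMO multi-receiver wiretap
channel, its secrecy capacity region is given by Theorem 6.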
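The role of the parameter alpha in the construction of (322)-(323) can be seen from the diagonal factor alone: because the orthonormal factor has determinant of magnitude one, the gain matrix is invertible exactly when every diagonal entry of the sum of the singular-value matrix and alpha times the indicator matrix is nonzero. The sketch below is a hypothetical illustration with made-up singular values, not part of the original proof.

```python
def aligned_gain_diagonal(singular_values, t, alpha):
    """Diagonal of Lambda + alpha * I_hat for a t x t gain of rank len(singular_values).

    The first (t - r) entries of Lambda are zero and are replaced by alpha;
    the last r entries are the strictly positive singular values.
    """
    r = len(singular_values)
    return [alpha] * (t - r) + list(singular_values)


def diagonal_determinant(diagonal):
    """Determinant of a diagonal matrix, i.e. the product of its entries."""
    det = 1.0
    for entry in diagonal:
        det *= entry
    return det


# Hypothetical rank-2 channel embedded in t = 4 dimensions.
singular_values = [2.0, 0.5]
t = 4

det_with_alpha = diagonal_determinant(aligned_gain_diagonal(singular_values, t, 0.1))
det_without_alpha = diagonal_determinant(aligned_gain_diagonal(singular_values, t, 0.0))
print(det_with_alpha, det_without_alpha)
```

For any alpha > 0 the determinant is strictly positive, so the modified channel is aligned, while alpha = 0 recovers the rank-deficient equivalent channel in (317)-(318).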

We now show that as $\alpha \to 0$, the secrecy capacity region of the channel described by
(322)-(323) converges to a region that includes the secrecy capacity region of the original
channel in (13)-(14). Since the original channel in (13)-(14) and the channel in (317)-(318)
have the same secrecy capacity region and the dirty-paper coding region, checking that
the secrecy capacity region of the channel described by (322)-(323) converges, as $\alpha \to 0$,
to a region that includes the secrecy capacity region of the channel described by (317)-(318)
is sufficient. To this end, consider an arbitrary $(2^{nR_1},\ldots,2^{nR_K},n)$ code which can
be transmitted with vanishingly small probability of error and in perfect secrecy when it
is used in the channel given in (317)-(318). We will show that the same code can also be
transmitted with vanishingly small probability of error and in perfect secrecy when it is used
in the channel given in (322)-(323) as $\alpha \to 0$. This will imply that the secrecy capacity
region of the channel given in (322)-(323) converges to a region that includes the secrecy
capacity region of the channel given in (317)-(318). We first note that

$$\tilde{\mathbf{Y}}_k = \left(\boldsymbol{\Lambda}_k + \alpha\hat{\mathbf{I}}_k\right)\mathbf{V}_k\mathbf{X} + \bar{\mathbf{N}}_k \tag{324}$$

$$= \begin{bmatrix} \alpha\hat{\mathbf{I}}_k^A\mathbf{V}_k\mathbf{X} \\ \boldsymbol{\Lambda}_k^B\mathbf{V}_k\mathbf{X} \end{bmatrix} + \begin{bmatrix} \mathbf{N}_k^A \\ \mathbf{N}_k^B \end{bmatrix} \tag{325}$$

$$= \begin{bmatrix} \tilde{\mathbf{Y}}_k^A \\ \tilde{\mathbf{Y}}_k^B \end{bmatrix}, \quad k=1,\ldots,K \tag{326}$$

where $\hat{\mathbf{I}}_k^A$ contains the first $(t-r_k)$ rows of $\hat{\mathbf{I}}_k$, and $\boldsymbol{\Lambda}_k^B$ contains the last $r_k$ rows of $\boldsymbol{\Lambda}_k$.
$\mathbf{N}_k^A$ is a Gaussian random vector that contains the first $(t-r_k)$ entries of $\bar{\mathbf{N}}_k$, and $\mathbf{N}_k^B$ is
a vector that contains the last $r_k$ entries. The covariance matrices of $\mathbf{N}_k^A$, $\mathbf{N}_k^B$ are $\boldsymbol{\Sigma}_k^A$, $\boldsymbol{\Sigma}_k^B$,
respectively, and $\mathbf{N}_k^A$ and $\mathbf{N}_k^B$ are independent as can be observed through (319). Similarly,
we can write

$$\bar{\mathbf{Y}}_k = \boldsymbol{\Lambda}_k\mathbf{V}_k\mathbf{X} + \bar{\mathbf{N}}_k \tag{327}$$

$$= \begin{bmatrix} \mathbf{0} \\ \boldsymbol{\Lambda}_k^B\mathbf{V}_k\mathbf{X} \end{bmatrix} + \begin{bmatrix} \mathbf{N}_k^A \\ \mathbf{N}_k^B \end{bmatrix} \tag{328}$$

$$= \begin{bmatrix} \bar{\mathbf{Y}}_k^A \\ \bar{\mathbf{Y}}_k^B \end{bmatrix}, \quad k=1,\ldots,K \tag{329}$$

We note that $\bar{\mathbf{Y}}_k^B = \tilde{\mathbf{Y}}_k^B$, $k=1,\ldots,K$, thus we have

$$\mathbf{X} \to \tilde{\mathbf{Y}}_k \to \bar{\mathbf{Y}}_k, \quad k=1,\ldots,K \tag{330}$$

which ensures that any message rate that is decodable by the kth user of the channel given in
(317)-(318) is also decodable by the kth user of the channel given in (322)-(323). Thus, any
$(2^{nR_1},\ldots,2^{nR_K},n)$ code which can be transmitted with vanishingly small probability of error
in the channel defined by (317)-(318) can be transmitted with vanishingly small probability
of error in the channel defined by (322)-(323) as well.



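We now check the secrecy constraints. To this end, we note that

$$\tilde{\mathbf{Z}} = \left(\boldsymbol{\Lambda}_Z + \alpha\hat{\mathbf{I}}_Z\right)\mathbf{V}_Z\mathbf{X} + \bar{\mathbf{N}}_Z \tag{331}$$

$$= \begin{bmatrix} \alpha\hat{\mathbf{I}}_Z^A\mathbf{V}_Z\mathbf{X} \\ \boldsymbol{\Lambda}_Z^B\mathbf{V}_Z\mathbf{X} \end{bmatrix} + \begin{bmatrix} \mathbf{N}_Z^A \\ \mathbf{N}_Z^B \end{bmatrix} \tag{332}$$

$$= \begin{bmatrix} \tilde{\mathbf{Z}}^A \\ \tilde{\mathbf{Z}}^B \end{bmatrix} \tag{333}$$

where $\hat{\mathbf{I}}_Z^A$ contains the first $(t-r_Z)$ rows of $\hat{\mathbf{I}}_Z$, and $\boldsymbol{\Lambda}_Z^B$ contains the last $r_Z$ rows of $\boldsymbol{\Lambda}_Z$.
$\mathbf{N}_Z^A$ is a Gaussian random vector that contains the first $(t-r_Z)$ entries of $\bar{\mathbf{N}}_Z$, and $\mathbf{N}_Z^B$ is a
vector that contains the last $r_Z$ entries. The covariance matrices of $\mathbf{N}_Z^A$, $\mathbf{N}_Z^B$ are $\boldsymbol{\Sigma}_Z^A$, $\boldsymbol{\Sigma}_Z^B$,
respectively, and $\mathbf{N}_Z^A$ and $\mathbf{N}_Z^B$ are independent as can be observed through (320). Similarly,
we can write

$$\bar{\mathbf{Z}} = \boldsymbol{\Lambda}_Z\mathbf{V}_Z\mathbf{X} + \bar{\mathbf{N}}_Z \tag{334}$$

$$= \begin{bmatrix} \mathbf{0} \\ \boldsymbol{\Lambda}_Z^B\mathbf{V}_Z\mathbf{X} \end{bmatrix} + \begin{bmatrix} \mathbf{N}_Z^A \\ \mathbf{N}_Z^B \end{bmatrix} \tag{335}$$

$$= \begin{bmatrix} \bar{\mathbf{Z}}^A \\ \bar{\mathbf{Z}}^B \end{bmatrix} \tag{336}$$

We note that $\bar{\mathbf{Z}}^B = \tilde{\mathbf{Z}}^B$, and thus we have

$$\mathbf{X} \to \tilde{\mathbf{Z}} \to \bar{\mathbf{Z}} \tag{337}$$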



We now show that any (2"-^^, . . . , 2"-^^) code that achieves the perfect secrecy rates (i?i, . . . , 
Rk) in the channel given in f l317l) -( l3T8l) also achieves the same perfect secrecy rates in the 
channel given in (I322l) - (l323p when a — > 0. To this end, let iS be a non-empty subset of 
{1, . . . , K}. We consider the following equivocation 



H{Ws\t-) = H{Ws) - I{Ws; Z") 

= H{Ws\'r') + I{Ws\ Z") - I{Ws\ Z") 

= i/(1^5|Z^'", Z^'") + I{Ws] Z"^'", Z^'") - I{Ws] Z^'", Z^'") 

= //(VTslZ^'", Z^'") - I{Ws] Z^'"iZ^'") 



(338) 
(339) 
(340) 
(341) 
(342) 



where (I34ip follows from the facts that Ws and Z"^'" = N"^'" are independent, and Z^'" 
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2B,n j^Q^ bound the mutual information term in 0342p 



I{Ws; Z^'"|Z^'") < /(X"; Z^'"|Z^'") 

= /i(Z^'"|Z^'") - /i(Z^''^|Z^'",X") 
= /i(Z^'"|Z^'") - /i(Z^'"|X'^) 
< /i(Z^'") - /i(Z^'"|X") 
= /fX";Z^'") 



< 



i=l 



<J2 max /(X,;Zf) 



< 



n 

E5N 



1=1 



2 log 



(343) 
(344) 
(345) 
(346) 
(347) 

(348) 
(349) 

(350) 
(351) 



where (l343|l follows from the Markov chain W5 ^ X" -> (Z^'",Z^'"), (l345ll is due to the 
Markov chain Z^-'^ ^ X" ^ Z^'", fIMB]) comes from the fact that conditioning cannot 
increase entropy, fl348p is a consequence of the fact that channel is memoryless, f l350p is 
due to the fact that subject to a covariance constraint, Gaussian distribution maximizes the 
differential entropy. Thus, plugging (I35ip into (I342p yields 



n n 2 









^Z 





(352) 



which implies that 



lim -H{Ws\Z'') > lim -HilVslZ"") - lim - log 



*o2 



•yA 
^Z 



= lim -H{Ws\Z' 

n~*oo n 

fce5 



(353) 

(354) 
(355) 



where (354) follows from the fact that $\log|\alpha^{2}\mathbf{A} + \mathbf{B}|$ is continuous in $\alpha$ for positive definite
matrices $\mathbf{A}$, $\mathbf{B}$, and (355) comes from our assumption that the codebook under consideration
achieves perfect secrecy in the channel given in (317)-(318). Thus, we have shown that
if a codebook achieves the perfect secrecy rates $(R_1, \ldots, R_K)$ in the channel defined by
(317)-(318), then it also achieves the same perfect secrecy rates in the channel defined by
(322)-(323) as $\alpha \to 0$. Thus, the secrecy capacity region of the latter channel converges to a
region that includes the secrecy capacity region of the channel in (317)-(318), and also the
secrecy capacity region of the original channel in (13)-(14). Since the channel in (322)-(323)
is an aligned channel, its secrecy capacity region is given by Theorem 6, and it is equal to
the dirty-paper coding region. Thus, to find the region that the secrecy capacity region of
the channel in (322)-(323) converges to as $\alpha \to 0$, it is sufficient to consider the region which
the dirty-paper coding region converges to as $\alpha \to 0$. For that purpose, pick the $k$th user,
and the identity encoding order, i.e., $\pi(k) = k$, $k = 1, \ldots, K$. The corresponding secrecy
rate is



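\begin{align}
&\frac{1}{2}\log\frac{\left|\bar{\mathbf{H}}_{\pi(k)}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_{\pi(k)}^{\top}+\hat{\boldsymbol{\Sigma}}_{\pi(k)}\right|}{\left|\bar{\mathbf{H}}_{\pi(k)}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_{\pi(k)}^{\top}+\hat{\boldsymbol{\Sigma}}_{\pi(k)}\right|}
-\frac{1}{2}\log\frac{\left|\bar{\mathbf{H}}_{Z}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_{Z}^{\top}+\hat{\boldsymbol{\Sigma}}_{Z}\right|}{\left|\bar{\mathbf{H}}_{Z}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\bar{\mathbf{H}}_{Z}^{\top}+\hat{\boldsymbol{\Sigma}}_{Z}\right|} \nonumber\\
&=\frac{1}{2}\log\frac{\left|\left(\hat{\mathbf{H}}_{\pi(k)}+\alpha\hat{\mathbf{I}}_{\pi(k)}\mathbf{V}_{\pi(k)}\right)\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\left(\hat{\mathbf{H}}_{\pi(k)}+\alpha\hat{\mathbf{I}}_{\pi(k)}\mathbf{V}_{\pi(k)}\right)^{\top}+\hat{\boldsymbol{\Sigma}}_{\pi(k)}\right|}{\left|\left(\hat{\mathbf{H}}_{\pi(k)}+\alpha\hat{\mathbf{I}}_{\pi(k)}\mathbf{V}_{\pi(k)}\right)\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\left(\hat{\mathbf{H}}_{\pi(k)}+\alpha\hat{\mathbf{I}}_{\pi(k)}\mathbf{V}_{\pi(k)}\right)^{\top}+\hat{\boldsymbol{\Sigma}}_{\pi(k)}\right|} \nonumber\\
&\quad-\frac{1}{2}\log\frac{\left|\left(\hat{\mathbf{H}}_{Z}+\alpha\hat{\mathbf{I}}_{Z}\mathbf{V}_{Z}\right)\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\left(\hat{\mathbf{H}}_{Z}+\alpha\hat{\mathbf{I}}_{Z}\mathbf{V}_{Z}\right)^{\top}+\hat{\boldsymbol{\Sigma}}_{Z}\right|}{\left|\left(\hat{\mathbf{H}}_{Z}+\alpha\hat{\mathbf{I}}_{Z}\mathbf{V}_{Z}\right)\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\left(\hat{\mathbf{H}}_{Z}+\alpha\hat{\mathbf{I}}_{Z}\mathbf{V}_{Z}\right)^{\top}+\hat{\boldsymbol{\Sigma}}_{Z}\right|} \tag{356}
\end{align}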



which converges to 



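\begin{align}
\frac{1}{2}\log\frac{\left|\hat{\mathbf{H}}_{\pi(k)}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\hat{\mathbf{H}}_{\pi(k)}^{\top}+\hat{\boldsymbol{\Sigma}}_{\pi(k)}\right|}{\left|\hat{\mathbf{H}}_{\pi(k)}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\hat{\mathbf{H}}_{\pi(k)}^{\top}+\hat{\boldsymbol{\Sigma}}_{\pi(k)}\right|}
-\frac{1}{2}\log\frac{\left|\hat{\mathbf{H}}_{Z}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\hat{\mathbf{H}}_{Z}^{\top}+\hat{\boldsymbol{\Sigma}}_{Z}\right|}{\left|\hat{\mathbf{H}}_{Z}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\hat{\mathbf{H}}_{Z}^{\top}+\hat{\boldsymbol{\Sigma}}_{Z}\right|} \tag{357}
\end{align}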



as $\alpha \to 0$ due to the continuity of $\log|\cdot|$ in positive semi-definite matrices. Moreover, (357)
is equal to



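\begin{align}
\frac{1}{2}\log\frac{\left|\mathbf{H}_{\pi(k)}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_{\pi(k)}^{\top}+\boldsymbol{\Sigma}_{\pi(k)}\right|}{\left|\mathbf{H}_{\pi(k)}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_{\pi(k)}^{\top}+\boldsymbol{\Sigma}_{\pi(k)}\right|}
-\frac{1}{2}\log\frac{\left|\mathbf{H}_{Z}\left(\sum_{i=1}^{k}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_{Z}^{\top}+\boldsymbol{\Sigma}_{Z}\right|}{\left|\mathbf{H}_{Z}\left(\sum_{i=1}^{k-1}\mathbf{K}_{\pi(i)}\right)\mathbf{H}_{Z}^{\top}+\boldsymbol{\Sigma}_{Z}\right|} \tag{358}
\end{align}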



which implies that the secrecy capacity region of the general Gaussian MIMO multi-receiver 
wiretap channel is given by the dirty-paper coding region, completing the proof. 
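
As a rough numerical illustration of the limiting argument, the following Python sketch evaluates a scalar analogue of the dirty-paper coding secrecy rate and of its perturbed version. It is not taken from the paper: the scalar gains, the power values, and the names `dpc_secrecy_rate` and `perturbed_rate` are hypothetical, and a zero scalar gain is assumed to stand in for a rank-deficient direction that the perturbation replaces by $\alpha$.

```python
import math

def dpc_secrecy_rate(h_k, h_z, powers, k, var_k=1.0, var_z=1.0):
    """Scalar analogue of the k-th user's DPC secrecy rate (identity encoding order).

    powers[i] plays the role of K_{i+1}; k is 1-based.
    """
    p_upto_k = sum(powers[:k])        # sum_{i=1}^{k} K_i
    p_below_k = sum(powers[:k - 1])   # sum_{i=1}^{k-1} K_i
    legit = 0.5 * math.log((h_k**2 * p_upto_k + var_k) / (h_k**2 * p_below_k + var_k))
    eve = 0.5 * math.log((h_z**2 * p_upto_k + var_z) / (h_z**2 * p_below_k + var_z))
    return legit - eve

def perturbed_rate(alpha, h_k, h_z, powers, k):
    """Rate in the perturbed channel (assumption: only zero gains are replaced by alpha)."""
    bar_h_k = h_k if h_k != 0 else alpha
    bar_h_z = h_z if h_z != 0 else alpha
    return dpc_secrecy_rate(bar_h_k, bar_h_z, powers, k)

if __name__ == "__main__":
    powers = [1.0, 2.0]
    limit = dpc_secrecy_rate(1.0, 0.0, powers, 2)   # eavesdropper observes only noise
    for alpha in (1e-1, 1e-2, 1e-3):
        gap = abs(perturbed_rate(alpha, 1.0, 0.0, powers, 2) - limit)
        print(f"alpha={alpha:g}  |perturbed - limit|={gap:.3e}")
```

In this toy setting the gap between the perturbed rate and the limiting rate shrinks as $\alpha$ decreases, which mirrors the continuity of the log-determinant expressions used above.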
8 Conclusions



We characterized the secrecy capacity region of the Gaussian MIMO multi-receiver wiretap 
channel. We showed that it is achievable with a variant of dirty-paper coding with Gaussian 
signals. Before reaching this result, we first visited the scalar case, and showed the necessity 
of a new proof technique for the converse. In particular, we showed that the extensions 
of existing converses for the Gaussian scalar broadcast channels fall short of resolving the 
ambiguity regarding the auxiliary random variables. We showed that, unlike the stand-alone 
use of the entropy-power inequality [24,25], the use of the relationships either between the 
MMSE and the mutual information or between the Fisher information and the differential 
entropy resolves this ambiguity. Extending this methodology to degraded vector channels, 
we found the secrecy capacity region of the degraded Gaussian MIMO multi-receiver wiretap 
channel. Once we obtained the secrecy capacity region of the degraded MIMO channel, we 
generalized it to arbitrary channels by using the channel enhancement method and some 
limiting arguments as in [29,30]. 



Appendices 



A Proof of Lemma 11 



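Let $\rho_i(\mathbf{x}|\mathbf{u}) = \frac{\partial \log f(\mathbf{x}|\mathbf{u})}{\partial x_i}$, i.e., the $i$th component of $\boldsymbol{\rho}(\mathbf{x}|\mathbf{u})$. Then, we have

\begin{align}
E\left[g(\mathbf{X})\rho_i(\mathbf{X}|\mathbf{U})\right] &= \int g(\mathbf{x})\,\frac{1}{f(\mathbf{x}|\mathbf{u})}\frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, f(\mathbf{x},\mathbf{u})\, d\mathbf{x}\, d\mathbf{u} \tag{359}\\
&= \int g(\mathbf{x})\,\frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, f(\mathbf{u})\, d\mathbf{x}\, d\mathbf{u} \tag{360}\\
&= \int \left[\int g(\mathbf{x})\,\frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, dx_i\right] f(\mathbf{u})\, d\mathbf{x}^{-}\, d\mathbf{u} \tag{361}
\end{align}

where $d\mathbf{x}^{-} = dx_1 \ldots dx_{i-1}\, dx_{i+1} \ldots dx_n$. The inner integral can be evaluated using integration
by parts as

\begin{align}
\int g(\mathbf{x})\,\frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, dx_i &= \Big[g(\mathbf{x}) f(\mathbf{x}|\mathbf{u})\Big]_{x_i=-\infty}^{\infty} - \int f(\mathbf{x}|\mathbf{u})\,\frac{\partial g(\mathbf{x})}{\partial x_i}\, dx_i \tag{362}\\
&= -\int f(\mathbf{x}|\mathbf{u})\,\frac{\partial g(\mathbf{x})}{\partial x_i}\, dx_i \tag{363}
\end{align}

where (363) comes from the assumption in (183). Plugging (363) into (361) yields

\begin{align}
E\left[g(\mathbf{X})\rho_i(\mathbf{X}|\mathbf{U})\right] &= -\int \frac{\partial g(\mathbf{x})}{\partial x_i}\, f(\mathbf{x},\mathbf{u})\, d\mathbf{x}\, d\mathbf{u} \tag{364}\\
&= -E\left[\frac{\partial g(\mathbf{X})}{\partial x_i}\right] \tag{365}
\end{align}

which concludes the proof.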
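
The identity proved in this appendix can be checked numerically in a simple special case. The sketch below assumes a scalar Gaussian $X$ with no conditioning variable, for which the score is $-(x-m)/\sigma^2$, and compares $E[g(X)\rho(X)]$ with $-E[g'(X)]$ by trapezoidal integration. The test function `g`, the parameters, and the helper names are illustrative choices, not part of the paper.

```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def expectation(func, mean, std, half_width=12.0, steps=20000):
    """Trapezoidal approximation of E[func(X)] for X ~ N(mean, std^2)."""
    lo = mean - half_width * std
    hi = mean + half_width * std
    dx = (hi - lo) / steps
    total = 0.0
    for n in range(steps + 1):
        x = lo + n * dx
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * func(x) * gaussian_pdf(x, mean, std)
    return total * dx

mean, std = 0.5, 1.2

def g(x):
    return math.sin(x) + x * x

def dg(x):
    return math.cos(x) + 2.0 * x

def score(x):
    # d/dx log f(x) for the Gaussian density above
    return -(x - mean) / std**2

lhs = expectation(lambda x: g(x) * score(x), mean, std)   # E[g(X) rho(X)]
rhs = -expectation(dg, mean, std)                         # -E[g'(X)]

if __name__ == "__main__":
    print(lhs, rhs)
```

Since $g(x) f(x)$ vanishes at infinity for this choice, the boundary term drops out as in (363), and the two quantities agree up to integration error.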



B Proof of Lemma 12



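Let $\rho_i(\mathbf{x}|\mathbf{u}) = \frac{\partial \log f(\mathbf{x}|\mathbf{u})}{\partial x_i}$, i.e., the $i$th component of $\boldsymbol{\rho}(\mathbf{x}|\mathbf{u})$. Then, we have

\begin{align}
E\left[\rho_i(\mathbf{X}|\mathbf{U}) \mid \mathbf{U} = \mathbf{u}\right] &= \int \frac{1}{f(\mathbf{x}|\mathbf{u})}\frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, f(\mathbf{x}|\mathbf{u})\, d\mathbf{x} \tag{366}\\
&= \int \left[\int \frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, dx_i\right] d\mathbf{x}^{-} \tag{367}
\end{align}

where $d\mathbf{x}^{-} = dx_1 \ldots dx_{i-1}\, dx_{i+1} \ldots dx_n$. The inner integral is

\begin{align}
\int \frac{\partial f(\mathbf{x}|\mathbf{u})}{\partial x_i}\, dx_i = f(\mathbf{x}|\mathbf{u})\Big|_{x_i=-\infty}^{+\infty} = 0 \tag{368}
\end{align}

since $f(\mathbf{x}|\mathbf{u})$ is a valid probability density function. This completes the proof of the first
part. For the second part, we have

\begin{align}
E\left[g(\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\right] = E\left[g(\mathbf{U})\, E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) \mid \mathbf{U} = \mathbf{u}\right]\right] = \mathbf{0} \tag{369}
\end{align}

where the second equality follows from the fact that the inner expectation is zero as the first
part of this lemma states. The last part of the lemma follows by selecting $g(\mathbf{U}) = E[\mathbf{X}|\mathbf{U}]$
in the second part of this lemma.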



C Proof of Lemma 14



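Throughout this proof, the subscript of $f$ will denote the random vector for which $f$ is the
density. For example, $f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})$ is the conditional density of $\mathbf{X}$. We first note that

\begin{align}
f_{\mathbf{W}}(\mathbf{w}|\mathbf{u}) = \int f_{\mathbf{X},\mathbf{W}}(\mathbf{x},\mathbf{w}|\mathbf{u})\, d\mathbf{x} = \int f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})\, f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})\, d\mathbf{x} \tag{370}
\end{align}

where the second equality is due to the conditional independence of $\mathbf{X}$ and $\mathbf{Y}$ given $\mathbf{U}$.
Differentiating both sides of (370), we get

\begin{align}
\frac{\partial f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})}{\partial w_i} &= \int f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})\,\frac{\partial f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{\partial w_i}\, d\mathbf{x} \tag{371}\\
&= -\int f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})\,\frac{\partial f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{\partial x_i}\, d\mathbf{x} \tag{372}\\
&= \Big[-f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})\, f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})\Big]_{x_i=-\infty}^{\infty} + \int f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})\,\frac{\partial f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}{\partial x_i}\, d\mathbf{x} \tag{373}\\
&= \int f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})\,\frac{\partial f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}{\partial x_i}\, d\mathbf{x} \tag{374}
\end{align}

where (372) is due to

\begin{align}
\frac{\partial f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{\partial w_i} &= \frac{\partial f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{\partial (w_i - x_i)}\,\frac{\partial (w_i - x_i)}{\partial w_i} \tag{375}\\
&= -\frac{\partial f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{\partial (w_i - x_i)}\,\frac{\partial (w_i - x_i)}{\partial x_i} \tag{376}\\
&= -\frac{\partial f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{\partial x_i} \tag{377}
\end{align}

and (374) follows from the fact that $f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})$, $f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})$ vanish at infinity since they are
probability density functions. Using (374), we get

\begin{align}
\rho_i(\mathbf{w}|\mathbf{u}) &= \frac{1}{f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})}\frac{\partial f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})}{\partial w_i} = \frac{1}{f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})}\int f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})\,\frac{\partial f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}{\partial x_i}\, d\mathbf{x} \tag{378}\\
&= \int \frac{f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})\, f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})}\,\frac{1}{f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}\frac{\partial f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}{\partial x_i}\, d\mathbf{x} \tag{379}\\
&= \int f_{\mathbf{X}}(\mathbf{x}|\mathbf{u},\mathbf{w})\,\frac{1}{f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}\frac{\partial f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})}{\partial x_i}\, d\mathbf{x} \tag{380}\\
&= E\left[\frac{1}{f_{\mathbf{X}}(\mathbf{X}|\mathbf{u})}\frac{\partial f_{\mathbf{X}}(\mathbf{X}|\mathbf{u})}{\partial x_i}\,\middle|\, \mathbf{W} = \mathbf{w}, \mathbf{U} = \mathbf{u}\right] \tag{381}
\end{align}

where (380) follows from the fact that

\begin{align}
f_{\mathbf{X}}(\mathbf{x}|\mathbf{u},\mathbf{w}) = \frac{f_{\mathbf{X},\mathbf{W}}(\mathbf{x},\mathbf{w}|\mathbf{u})}{f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})} = \frac{f_{\mathbf{X}}(\mathbf{x}|\mathbf{u})\, f_{\mathbf{Y}}(\mathbf{w}-\mathbf{x}|\mathbf{u})}{f_{\mathbf{W}}(\mathbf{w}|\mathbf{u})} \tag{382}
\end{align}

Equation (381) implies

\begin{align}
\boldsymbol{\rho}(\mathbf{w}|\mathbf{u}) = E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U} = \mathbf{u}) \mid \mathbf{W} = \mathbf{w}, \mathbf{U} = \mathbf{u}\right] \tag{383}
\end{align}

and due to symmetry, we also have

\begin{align}
\boldsymbol{\rho}(\mathbf{w}|\mathbf{u}) = E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U} = \mathbf{u}) \mid \mathbf{W} = \mathbf{w}, \mathbf{U} = \mathbf{u}\right] \tag{384}
\end{align}

which completes the proof.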
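
For intuition, the relation in (383) can be verified numerically in a special case. The sketch below assumes independent scalar Gaussians $X \sim \mathcal{N}(0, \sigma_X^2)$ and $Y \sim \mathcal{N}(0, \sigma_Y^2)$ with no conditioning variable, so that $W = X + Y$ has score $-w/(\sigma_X^2 + \sigma_Y^2)$. The function names and variances are hypothetical choices made only for this check.

```python
import math

def normal_pdf(x, var):
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def score_of_sum(w, var_x, var_y):
    """Score of W = X + Y, which is N(0, var_x + var_y) in this Gaussian example."""
    return -w / (var_x + var_y)

def conditional_mean_of_score(w, var_x, var_y, half_width=40.0, steps=40000):
    """E[rho_X(X) | W = w], computed by trapezoidal integration over x."""
    dx = 2.0 * half_width / steps
    num = 0.0
    den = 0.0
    for n in range(steps + 1):
        x = -half_width + n * dx
        weight = 0.5 if n in (0, steps) else 1.0
        joint = weight * normal_pdf(x, var_x) * normal_pdf(w - x, var_y)
        num += (-x / var_x) * joint   # rho_X(x) = -x / var_x
        den += joint                  # proportional to f_W(w)
    return num / den

if __name__ == "__main__":
    for w in (-1.5, 0.0, 0.7, 2.0):
        print(w, score_of_sum(w, 2.0, 3.0), conditional_mean_of_score(w, 2.0, 3.0))
```

The ratio `num / den` is the discretized form of the integral in (380), with the posterior density written as in (382).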



D Proof of Lemma 15

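Let $\mathbf{W} = \mathbf{X} + \mathbf{Y}$. We have

\begin{align}
\mathbf{0} &\preceq E\left[\big(\mathbf{A}\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) + (\mathbf{I}-\mathbf{A})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U}) - \boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\big)\big(\mathbf{A}\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) + (\mathbf{I}-\mathbf{A})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U}) - \boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\big)^{\top}\right] \tag{385}\\
&= \mathbf{A}E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right]\mathbf{A}^{\top} + \mathbf{A}E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})^{\top}\right](\mathbf{I}-\mathbf{A})^{\top} \nonumber\\
&\quad - \mathbf{A}E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] + (\mathbf{I}-\mathbf{A})E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right]\mathbf{A}^{\top} \nonumber\\
&\quad + (\mathbf{I}-\mathbf{A})E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})^{\top}\right](\mathbf{I}-\mathbf{A})^{\top} - (\mathbf{I}-\mathbf{A})E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] \nonumber\\
&\quad - E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right]\mathbf{A}^{\top} - E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})^{\top}\right](\mathbf{I}-\mathbf{A})^{\top} \nonumber\\
&\quad + E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] \tag{386}
\end{align}

We note that, from the definition of the conditional Fisher information matrix, we have

\begin{align}
E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right] &= \mathbf{J}(\mathbf{X}|\mathbf{U}) \tag{387}\\
E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})^{\top}\right] &= \mathbf{J}(\mathbf{Y}|\mathbf{U}) \tag{388}\\
E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] &= \mathbf{J}(\mathbf{W}|\mathbf{U}) \tag{389}
\end{align}

Moreover, we have

\begin{align}
E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})^{\top}\right] &= \left(E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right]\right)^{\top} \tag{390}\\
&= \left(E\left[E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U}) \mid \mathbf{U} = \mathbf{u}\right] E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) \mid \mathbf{U} = \mathbf{u}\right]^{\top}\right]\right)^{\top} \tag{391}\\
&= \mathbf{0} \tag{392}
\end{align}

where (391) comes from the fact that given $\mathbf{U} = \mathbf{u}$, $\mathbf{X}$ and $\mathbf{Y}$ are conditionally independent,
and (392) follows from the first part of Lemma 12, namely

\begin{align}
E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U}) \mid \mathbf{U} = \mathbf{u}\right] = E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U}) \mid \mathbf{U} = \mathbf{u}\right] = \mathbf{0} \tag{393}
\end{align}

Furthermore, we have

\begin{align}
E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] &= E\left[E\left[\boldsymbol{\rho}(\mathbf{X}|\mathbf{U} = \mathbf{u}) \mid \mathbf{W} = \mathbf{w}, \mathbf{U} = \mathbf{u}\right]\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] \tag{394}\\
&= E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] \tag{395}\\
&= \mathbf{J}(\mathbf{W}|\mathbf{U}) \tag{396}
\end{align}

where (395) follows from Lemma 14 and (396) comes from the definition of the conditional
Fisher information matrix. Similarly, we also have

\begin{align}
E\left[\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})^{\top}\right] = E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{X}|\mathbf{U})^{\top}\right] = E\left[\boldsymbol{\rho}(\mathbf{W}|\mathbf{U})\boldsymbol{\rho}(\mathbf{Y}|\mathbf{U})^{\top}\right] = \mathbf{J}(\mathbf{W}|\mathbf{U}) \tag{397}
\end{align}

Thus, using (387)-(389), (392), (396)-(397) in (386), we get

\begin{align}
\mathbf{0} &\preceq \mathbf{A}\mathbf{J}(\mathbf{X}|\mathbf{U})\mathbf{A}^{\top} - \mathbf{A}\mathbf{J}(\mathbf{W}|\mathbf{U}) + (\mathbf{I}-\mathbf{A})\mathbf{J}(\mathbf{Y}|\mathbf{U})(\mathbf{I}-\mathbf{A})^{\top} - (\mathbf{I}-\mathbf{A})\mathbf{J}(\mathbf{W}|\mathbf{U}) \tag{398}\\
&\quad - \mathbf{J}(\mathbf{W}|\mathbf{U})\mathbf{A}^{\top} - \mathbf{J}(\mathbf{W}|\mathbf{U})(\mathbf{I}-\mathbf{A})^{\top} + \mathbf{J}(\mathbf{W}|\mathbf{U}) \tag{399}\\
&= \mathbf{A}\mathbf{J}(\mathbf{X}|\mathbf{U})\mathbf{A}^{\top} + (\mathbf{I}-\mathbf{A})\mathbf{J}(\mathbf{Y}|\mathbf{U})(\mathbf{I}-\mathbf{A})^{\top} - \mathbf{J}(\mathbf{W}|\mathbf{U}) \tag{400}
\end{align}

which completes the proof.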


E Proof of Lemma 17

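Consider $\mathbf{J}(\mathbf{X}|\mathbf{U})$

\begin{align}
\mathbf{J}(\mathbf{X}|\mathbf{U}) &= \mathbf{J}(\mathbf{X}|\mathbf{U},\mathbf{V}) \tag{401}\\
&= E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X}|\mathbf{U},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{X}|\mathbf{U},\mathbf{V})^{\top}\right] \tag{402}\\
&= E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{U},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{U},\mathbf{V})^{\top}\right] \tag{403}\\
&= E\left[\big(\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V}) + \nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})\big)\big(\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V}) + \nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})\big)^{\top}\right] \tag{404}\\
&= E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})^{\top}\right] + E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})^{\top}\right] \nonumber\\
&\quad + E\left[\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})^{\top}\right] + E\left[\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})^{\top}\right] \tag{405}
\end{align}

where (401) is due to the Markov chain $\mathbf{V} \to \mathbf{U} \to \mathbf{X}$, (403) comes from the fact that

\begin{align}
\nabla_{\mathbf{x}}\log f(\mathbf{x}|\mathbf{u},\mathbf{v}) &= \nabla_{\mathbf{x}}\big(\log f(\mathbf{x},\mathbf{u},\mathbf{v}) - \log f(\mathbf{u},\mathbf{v})\big) \tag{406}\\
&= \nabla_{\mathbf{x}}\log f(\mathbf{x},\mathbf{u},\mathbf{v}) \tag{407}
\end{align}

and (404) is due to the fact that $f(\mathbf{x},\mathbf{u},\mathbf{v}) = f(\mathbf{x},\mathbf{v}) f(\mathbf{u}|\mathbf{x},\mathbf{v})$. We note that

\begin{align}
\mathbf{J}(\mathbf{X}|\mathbf{V}) = E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})^{\top}\right] \tag{408}
\end{align}

and

\begin{align}
E\left[\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})^{\top}\right] \succeq \mathbf{0} \tag{409}
\end{align}

Using (408) and (409) in (405), we get

\begin{align}
\mathbf{J}(\mathbf{X}|\mathbf{U}) &\succeq \mathbf{J}(\mathbf{X}|\mathbf{V}) + E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})^{\top}\right] \nonumber\\
&\quad + E\left[\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})^{\top}\right] \tag{410}
\end{align}

We now show that the cross-terms in (410) vanish. To this end, consider the $(i,j)$th entry
of the first cross-term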
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(411) 

(412) 
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(414) 



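where the inner integral can be evaluated as

\begin{align}
\int \frac{\partial f(\mathbf{u}|\mathbf{x},\mathbf{v})}{\partial x_j}\, d\mathbf{u} = \frac{\partial}{\partial x_j}\int f(\mathbf{u}|\mathbf{x},\mathbf{v})\, d\mathbf{u} = \frac{\partial (1)}{\partial x_j} = 0 \tag{415}
\end{align}

where the interchange of the differentiation and the integration is justified by the assumption
given in (206). Thus, using (415) in (414) implies that

\begin{align}
E\left[\nabla_{\mathbf{x}}\log f(\mathbf{X},\mathbf{V})\,\nabla_{\mathbf{x}}\log f(\mathbf{U}|\mathbf{X},\mathbf{V})^{\top}\right] = \mathbf{0} \tag{416}
\end{align}

Thus, using (416) in (410), we get

\begin{align}
\mathbf{J}(\mathbf{X}|\mathbf{U}) \succeq \mathbf{J}(\mathbf{X}|\mathbf{V}) \tag{417}
\end{align}

which completes the proof.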
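
The ordering in (417) can be illustrated with a Gaussian Markov chain. The sketch below assumes scalar variables with $U = V + N_2$ and $X = U + N_1$, where $N_1$ and $N_2$ are independent zero-mean Gaussians, so that $X$ given $U$ has variance $\sigma_1^2$ and $X$ given $V$ has variance $\sigma_1^2 + \sigma_2^2$. The function `fisher_information` and the variances are illustrative; it integrates the squared score numerically rather than using the closed form.

```python
import math

def fisher_information(var, half_width=12.0, steps=20000):
    """Numerically integrate score(x)^2 * f(x) for a zero-mean Gaussian with variance var.

    A location shift does not change the Fisher information, so the zero-mean
    density is used for both conditional laws below.
    """
    std = math.sqrt(var)
    lo = -half_width * std
    dx = 2.0 * half_width * std / steps
    total = 0.0
    for n in range(steps + 1):
        x = lo + n * dx
        weight = 0.5 if n in (0, steps) else 1.0
        pdf = math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)
        score = -x / var
        total += weight * score * score * pdf
    return total * dx

var_n1, var_n2 = 0.8, 1.5
j_x_given_u = fisher_information(var_n1)            # X | U = u ~ N(u, var_n1)
j_x_given_v = fisher_information(var_n1 + var_n2)   # X | V = v ~ N(v, var_n1 + var_n2)

if __name__ == "__main__":
    print(j_x_given_u, j_x_given_v)
```

Conditioning on the variable closer to $X$ in the chain leaves less residual noise, which is why the first quantity is the larger one in this example.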
F Proof of Lemma 18



Since we assumed $\mu_j > 0$, $j = 1, \ldots, m$, we can select



■ij+i 



-1 



J^K,, j = 0,l...,m-l (41^ 



i=l 



which is equivalent to 



fij+i 5^ + S,+i = + ^i+i + ^i+i' J = 0, 1 . . . , m - 1 (419) 



j=i 



i=l 
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and that implies z< Tij ^ j — 1, ... ,111. Furthermore, for j' = 0, . . . , m 
where (421) follows from (419), and (423) and (427) are consequences of the KKT conditions
$\mathbf{M}_j\mathbf{K}_j = \mathbf{K}_j\mathbf{M}_j = \mathbf{0}$, $j = 1, \ldots, m$. Finally, (429) is equivalent to



-1 X -1 

+ M,+i, j = 0,...,m-l (430) 



Plugging (419) and (430) into the KKT conditions in (288) and (289) yields the third part
of the lemma.

We now prove the second part of the lemma. To this end, consider the second equation 
of the third part of the lemma, i.e., the following 

5^K, + S„j =/i^f^K, + S^j +Mz (431) 

which implies Sm ^ S^. Now, consider the first equation of the third part of the lemma for 
j = m — 1, i.e., the following 



^m—l \ ^ /m— 1 \ ^ /in—l ^ ^ 



/^m— 1 



■'m 

. i=l / \ i=l J \ i=\ 



/m-l \ ^1 

- /i^ K, + (432) 



Since the matrix on the right hand side of the equation is positive semi-definite due to the 
fact that -< S^, and we assume that /i^ > fJ^m-i, (14321) implies 



-1 \ -1 



^m-l \ ~^ /m-l \ /m-l \ /m-l 

,i=i J \i=i J \i=i J \i=i 



(433) 



which in turn implies Sm-i ^ '^m ^ S^. Similarly, if one keeps checking the first equation 
of the third part of the lemma in the reverse order, one can get 

Si ^ . . . ^ ^ (434) 

Moreover, the definition of Si, i.e., (14191) for j = 0, 



Si 



1 ^ ' 



+ — Ml 



(435) 



implies that Si :^ completing the proof of the second part of the lemma. 
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We now show the fourth part of the lemma 
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(436) 
(437) 
(438) 

(439) 
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(441) 



where (438) follows from (430) and (439) is a consequence of the KKT conditions $\mathbf{K}_j\mathbf{M}_j =
\mathbf{M}_j\mathbf{K}_j = \mathbf{0}$, $j = 1, \ldots, m$.
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The proof of the fifth part of the lemma follows similarly 
where (443) follows from the second equation of the third part of the lemma, and (444) is a
consequence of the KKT condition in (285), completing the proof.
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